Credit Card Fraud Detection Using Machine Learning As Data Mining Technique


Ong Shu Yee, Saravanan Sagadevan and Nurul Hashimah Ahamed Hassain Malim
School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.

Abstract—The rapid growth of online transactional activity has increased the number of fraud cases worldwide and causes tremendous losses to individuals and the financial industry. Although many kinds of criminal activity occur in the financial industry, credit card fraud is among the most prevalent and the most worrying to online customers. Countering fraud through data mining and machine learning is therefore one of the prominent approaches introduced by scholars to prevent the losses caused by these illegal acts. Primarily, data mining techniques were employed to study the patterns and characteristics of suspicious and non-suspicious transactions based on normalized and anomalous data. Machine learning (ML) techniques, on the other hand, were employed to predict suspicious and non-suspicious transactions automatically using classifiers. The combination of machine learning and data mining techniques was therefore able to identify genuine and non-genuine transactions by learning the patterns of the data. This paper discusses supervised classification using the Bayesian network classifiers K2, Tree Augmented Naïve Bayes (TAN) and Naïve Bayes, together with the Logistic and J48 classifiers. After the dataset was preprocessed using normalization and Principal Component Analysis, all the classifiers achieved more than 95.0% accuracy, an improvement over the results attained before preprocessing.

Index Terms—Credit Card; Data Mining; Fraud Detection; Machine Learning.

I. INTRODUCTION

According to the Global Payments Report 2015, the credit card was the most used payment method globally in 2014, ahead of other methods such as e-wallets and bank transfers [1]. This huge volume of transactions is often eyed by cyber criminals as an opportunity to commit fraud through credit card services. Credit card fraud is defined as the unauthorized usage of a card, unusual transaction behavior, or transactions on an inactive card [2]. In general, there are three categories of credit card fraud: conventional frauds (e.g. stolen, fake and counterfeit cards), online frauds (e.g. false or fake merchant sites), and merchant-related frauds (e.g. merchant collusion and triangulation) [3].

In the past couple of years, credit card breaches have been trending alarmingly. According to the Nilson Report, global credit card fraud losses reached $16.31 billion in 2014 and are estimated to exceed $35 billion in 2020 [4]. It is therefore necessary to develop credit card fraud detection techniques as a countermeasure against these illegal activities. In general, credit card fraud detection is the process of identifying whether transactions are genuine or fraudulent. As data mining and machine learning techniques are widely used to counter cyber crime, scholars have often adopted these approaches to study and detect credit card fraud.

Data mining is the process of finding interesting, novel and insightful patterns, as well as discovering understandable, descriptive and predictive models, from large-scale data collections [5, 6]. The ability of data mining techniques to extract useful information from large-scale data using statistical and mathematical techniques assists credit card fraud detection by differentiating the characteristics of common and suspicious credit card transactions. While data mining focuses on discovering valuable intelligence, machine learning is rooted in learning that intelligence and developing its own model for purposes such as classification and clustering.

Machine learning techniques are applied widely across computer science domains such as spam filtering, web search, ad placement, recommender systems, credit scoring, drug design, fraud detection, stock trading, and many others. Machine learning classifiers operate by building a model from example inputs and using it to make predictions or decisions, rather than following strictly static program instructions. Many different types of machine learning approaches are available, intended to solve heterogeneous problems. Because this study focuses on classification, the discussion that follows is restricted to that topic. Machine learning classification refers to the process of learning to assign instances to predefined classes. Formally, there are several types of learning, such as supervised, semi-supervised, unsupervised, reinforcement, transduction and learning to learn [7]. As this study is concerned with supervised classification, the other methods are not elaborated further. In most classification studies, supervised learning is favoured over other methods because the classes of the instances can be controlled through human intervention. In supervised learning, the instances are labeled with their classes before being fed into the classifiers. The performance of the classifiers can then be measured using suitable evaluation metrics.

In the case of credit card fraud detection, binary classification was employed because the instances were labeled as fraud or non-fraud. The inputs were represented as a Boolean vector $x = (x_1, \ldots, x_q)$, where $x_j = 1$ if the $j$-th characteristic appeared in the instance and $x_j = 0$ otherwise. A classifier takes as input a training set of pairs $(x_i, y_i)$, where $x_i = (x_{i1}, \ldots, x_{iq})$ was an observed input and $y_i$ was the corresponding output of the classifier. The rest of the paper is organized into background studies, research methodology, results, discussions and conclusions.
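The Boolean input representation described in the Introduction can be illustrated with a short Python sketch. The characteristic names below (`wrong_pin`, `address_mismatch`, `blacklisted_country`) are hypothetical placeholders chosen for illustration, since the paper does not list its Boolean features, and the helper functions are not part of the authors' tooling.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# Hypothetical characteristic names (assumption: the paper does not enumerate them).
CHARACTERISTICS: Sequence[str] = ("wrong_pin", "address_mismatch", "blacklisted_country")


def encode(present: Set[str], characteristics: Sequence[str] = CHARACTERISTICS) -> List[int]:
    """Return x = (x_1, ..., x_q) with x_j = 1 if the j-th characteristic is present, else 0."""
    return [1 if name in present else 0 for name in characteristics]


def build_training_set(
    instances: Iterable[Tuple[Set[str], int]]
) -> List[Tuple[List[int], int]]:
    """Pair each encoded input x_i with its label y_i (1 = fraud, 0 = non-fraud)."""
    return [(encode(present), label) for present, label in instances]


if __name__ == "__main__":
    raw = [({"wrong_pin", "blacklisted_country"}, 1), (set(), 0)]
    for x, y in build_training_set(raw):
        print(x, y)
```

Each instance becomes a fixed-length 0/1 vector paired with a binary label, which is the form a supervised binary classifier expects.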

e-ISSN: 2289-8131 Vol. 10 No. 1-4 23


Journal of Telecommunication, Electronic and Computer Engineering

II. BACKGROUND STUDIES

Data mining and machine learning are popular methods for studying and combating credit card fraud, and a large number of studies have exploited their strengths to prevent it. Using a Self-Organizing Map and a Neural Network, the study in [8] reported a Receiver Operating Characteristic (ROC) result covering over 95.00% of fraud cases without false alarms. The Hidden Markov Model (HMM) has also been applied to credit card fraud detection with a low percentage of false alarms [9]. However, handling the transitions between states and calculating the probabilities in an HMM are computationally costly. Furthermore, rather than using single classifiers, some credit card fraud detection studies used meta-learners based on supervised learning. Stolfo et al. investigated a credit card fraud detection system using four algorithms, namely Iterative Dichotomiser 3 (ID3), Classification and Regression Tree (CART), Ripper and Bayes, as base learners and tested them with heterogeneous data distributions [10]. Based on a 50%/50% distribution of instances (fraud and non-fraud), the study found that meta-learning using Bayes as a base learner obtained a higher true positive rate than the other meta-learners. However, even though the 50%/50% distribution yields good results, it does not reflect real-world circumstances, where genuine credit card transactions far outnumber non-legitimate ones. Researchers have also tested other types of meta-learning classifiers such as AdaBoost, LogitBoost, Bagging and Dagging, with interesting outcomes [11].

In our literature study, the Bayesian Network is one of the classifier types that have been widely applied to detect fraud in the credit card industry. Maes et al. examined the true positives and false positives produced by a Bayesian Belief Network and an Artificial Neural Network in classifying credit card fraud instances. The study found that the Bayesian network performed approximately 8% better than the Artificial Neural Network and claimed that the former's processing time is shorter than the latter's [12]. Rather than using traditional classification methods, the investigation in [13] performed cost-sensitive credit card fraud detection based on the Bayes Minimum Risk technique. The study measured the performance of Logistic Regression (LR), C4.5 and Random Forest (RF), and showed that adjusting the probabilities with the Bayes Minimum Risk classifier on RF classification yielded consistently better results than on LR and C4.5.

Throughout our observation and analysis of previous studies, Bayesian Network classifiers have become one of the popular classifier types widely used to classify credit card fraud data. Therefore, this study investigated classification by several Bayesian classifiers, namely K2, Tree Augmented Naïve Bayes (TAN) and Naïve Bayes. Moreover, this study also measured the performance of Logistic regression and J48 under the proposed methodology. A brief discussion of the Bayesian Network classifier and the proposed classifiers follows.

A. Bayesian Network Classifier

A Bayesian Network is a threshold-based model that computes the sum of the output accumulated from child nodes. The reason behind the creation of such a model is the ability of child nodes to operate independently, without interrupting other child nodes, and in particular to influence the probability of the root node. Basically, a Bayesian Network A = <N, B, Θ> is built on a directed acyclic graph (DAG) <N, B> over a set of random variables, where each node n ∈ N represents a variable of the data and each arc b ∈ B between nodes represents a probabilistic dependency. A Bayesian network is able to compute the conditional probability of a node given the values assigned to other nodes. Bayesian Networks have several advantages, such as the ability to handle incomplete inputs and to learn causal relationships [17]. As illustrated in Figure 1, there are minor differences between Naïve Bayes, TAN and the general framework of a Bayesian Network.

Naïve Bayes is a very popular classifier as it is simple, efficient and performs well on real-world problems. It is a probabilistic classifier based on Bayes' rule with strong independence assumptions. In simple terms, its "independent feature model" assumes that the presence or absence of a particular feature of a class is unrelated to the presence or absence of any other feature. K2, another Bayesian classifier, uses scoring functions to compute the joint probability of any instantiation of all the variables in a belief network as a product of probabilities [18]. In WEKA, the K2 classifier uses hill-climbing methods to develop the Bayesian beliefs. The TAN classifier, on the other hand, uses a Bayesian scoring function to develop the Bayesian belief. As illustrated in Figure 1, the TAN classifier allows arcs between the children of the classification node x_c. The TAN classifier is therefore able to compute the probability from each child and eventually identify the appropriate classes of the children based on the computed probability. Although the information channeled by TAN looks better than that of Naïve Bayes, none of the studies we found investigated the performance of TAN in the credit card fraud detection domain.

Compared with Naïve Bayes, which is a generative model, Logistic regression is a discriminative classifier that predicts the probability using a direct functional form. It uses conditional probability and iterative estimation to estimate the classes of the instances. J48 is an open-source Java implementation in WEKA of the C4.5 algorithm, which was developed by Ross Quinlan to generate a decision tree from a set of labeled input data [19]. J48 is a predictive machine-learning classifier that determines the target values of new samples based on various attribute values in the data. The internal nodes of the decision tree represent the different attributes or features, while the branches between nodes denote the possible values of these attributes in the observed samples. The terminal nodes depict the final classification of the target value.

Figure 1: Illustration of Naïve Bayes, TAN and General BN structures

III. RESEARCH HYPOTHESIS

Based on the review of past studies, two main conclusions are drawn about credit card fraud detection investigations. The first conclusion is that credit card data plays an essential role in identifying fraudulent and
non-fraudulent characteristics. However, obtaining real credit card fraud data is very hard because of the privacy and sensitivity of the records. Therefore, to mimic real data, the authors of this study used dummy data created by manipulating certain features that were expected to have a significant impact on fraud detection. For instance, a transaction could be regarded as suspicious if the customer entered a wrong PIN, if the actual or shipping address differed from the billing address, or if the transaction date and time followed the previous transactions too closely and involved a large sum. Furthermore, some countries, such as Yugoslavia, Lithuania and Pakistan, have a very high number of fraud incidents with unverifiable addresses. Based on such indicators, the data was developed using several attributes: credit card number, reference number, terminal id, actual PIN, entered PIN, transaction amount, transaction date and time, location, billing address and shipping address. These attributes are the common variables used to study credit card fraud. The data was developed manually using a spreadsheet and a GNU automatic data generation script derived from generatedata.com. Instances were labeled as fraud when they matched one of the attribute combinations stated in Table 1; all remaining combinations were labeled as non-fraud.

Table 1
Rules to Labeling Fraud Instances (T=TRUE, F=FALSE)
Conditions: Time Interval, Similar Pin, Blacklisted Country, Exceed Threshold, Similar Address

Case    Condition values           Fraud
1       T    T    T    T    T      T
2       T    T    T    T    F      T
3       T    T    T    F    F      T
4       T    T    F    T    F      T
5       T    T    F    F    F      T
6       T    F    T    T    T      T
7       T    F    T    F    F      T
8       T    F    T    T    F      T
9       T/F  T/F  T/F  T/F  T/F    T

In order to evaluate the validity of the dummy data, the first experiment was conducted to verify the suitability of the corpus for credit card fraud detection. The second conclusion is that most previous studies used heterogeneous types of classifiers to measure performance in detecting genuine and non-genuine transactions. To contribute further to the body of knowledge, the second experiment was conducted to evaluate the performance of the proposed classifiers in classifying credit card fraud activities. The first and second hypotheses, which reflect the former and latter experiments respectively, are stated as follows:

Hypothesis (1): The dummy dataset that was created based on suspicious behaviors can be used for classification in data mining.
Hypothesis (2): The performance on the dataset that underwent data preprocessing is better than on the raw dataset.

IV. RESEARCH METHODOLOGY

Figure 2: A simple illustration on the flow of methodology in this work

In the classification process, WEKA, a prominent data mining and machine learning tool, was used to measure the performance of the classifiers. WEKA is one of the prominent open-source tools widely used to study many real-world problems such as sentiment analysis, personality detection, spam filtering and fraud detection. The classification was run using 10-fold cross-validation. This technique is widely applied in data mining and machine learning studies because training and testing take place over the entire dataset. Through 10-fold cross-validation, the dataset was split into ten parts, each part was held out in turn, and the results were finally averaged. In other words, each data point in the dataset was used once for testing and nine times for training. To measure the performance of the classifiers, several evaluation metrics were employed in this study. The output of the metrics depended primarily on the counts of True Positives (TP), True Negatives (TN), False Positives (FP) and False Negatives (FN). TP is the number of fraud transactions predicted as fraud, while FP is the number of legal transactions predicted as fraud. TN is the number of legal transactions predicted as legal, while FN is the number of fraud transactions predicted as legal. This study evaluated the performance of the classifiers using True Positive Rate (TPR), False Positive Rate (FPR), Precision, Recall, F-Measure and accuracy. The description and formula for each evaluation metric are given in Table 2.

Table 2
The Formula of Metrics Used in the Study

Metric                       Formula and Description
True Positive Rate (TPR)     TPR = TP / (TP + FN)
False Positive Rate (FPR)    FPR = FP / (FP + TN)
Precision                    Precision = TP / (TP + FP)
Recall                       Recall = TP / (TP + FN)
F-Measure                    F-Measure = 2TP / (2TP + FP + FN)
Accuracy                     Accuracy = (TP + TN) / (TP + TN + FP + FN)
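The formulas in Table 2 can be evaluated directly from the four confusion-matrix counts. The following Python sketch is only an illustration of those formulas, not the WEKA implementation used in the study; the function name `evaluation_metrics` and the example counts are invented for the demonstration.

```python
from typing import Dict


def evaluation_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute the Table 2 metrics from confusion-matrix counts."""

    def ratio(numerator: float, denominator: float) -> float:
        # Return 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    return {
        "TPR": ratio(tp, tp + fn),
        "FPR": ratio(fp, fp + tn),
        "Precision": ratio(tp, tp + fp),
        "Recall": ratio(tp, tp + fn),
        "F-Measure": ratio(2 * tp, 2 * tp + fp + fn),
        "Accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the paper's experiments.
    for name, value in evaluation_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Note that TPR and Recall share the same formula in Table 2, so the sketch returns identical values for both.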

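The 10-fold cross-validation procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: the folds are contiguous and neither shuffled nor stratified, so WEKA's own procedure may differ in those respects. The names `k_fold_indices`, `cross_validate` and `score_fold` are hypothetical.

```python
from typing import Callable, Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists so that every index is tested exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n: int, k: int, score_fold: Callable[[List[int], List[int]], float]) -> float:
    """Average the score returned by score_fold over the k held-out folds."""
    scores = [score_fold(train, test) for train, test in k_fold_indices(n, k)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for train, test in k_fold_indices(20, 10):
        print(len(train), test)
```

In practice `score_fold` would train a classifier on the `train` indices and return a metric such as accuracy on the `test` indices; here it is left as a caller-supplied function.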
The overview of the research methodology is illustrated in Figure 2. The following paragraphs elaborate on data transformation and data reduction. Generally, data transformation and data reduction are referred to as the data preprocessing phase, where the raw data is cleaned and transformed into appropriate forms (standardization) to be evaluated and fed into machine learners. Data transformation involves activities such as normalization, smoothing, aggregation, attribute construction and generalization of the data, while data reduction reduces the number of attributes through techniques such as data cube aggregation, removal of irrelevant attributes and principal component analysis. For instance, during data transformation, the format of the transaction date and time was standardized into a uniform form so that machine learners could interpret it as a date and time attribute. Then, Principal Component Analysis was employed to detect anomalous transactions. Principal Component Analysis is a method that transforms correlated variables into a smaller number of uncorrelated attributes called principal components. The objective of applying the method was to reduce the dimensionality of the dataset and discover new, meaningful underlying attributes. The advantage of Principal Component Analysis is that, when the dimensions of the data are reduced using eigenvectors, the loss of information is insignificant. Furthermore, the losses could be traced back by decompressing the eigenvalue.

V. RESULTS & EVALUATION

This study used two datasets in the experiments: the raw dataset, and a new dataset created from it by data transformation and data reduction.

A. Results and Analysis for Experiment 1

For Experiment 1, the raw dummy dataset was used to evaluate the integrity of the data for credit card fraud detection. The results (see Table 3) showed that the TPR (75.0%), precision (73.0%), recall (75.0%), F-Measure (68.5%) and accuracy (84.0%) of TAN were the highest among the classifiers. The minimal FPR of TAN showed its ability to process the raw data better than the other classifiers, even though its processing time was longer than that of K2, Naïve Bayesian and Logistic. This could be due to heavy processes such as finding the probabilities and creating the tree model, which lengthened the processing of the data. J48, which like TAN is based on a tree model, achieved a TPR (73.0%), precision (69.4%), recall (67.5%) and F-Measure (67.4%) slightly lower than TAN. Moreover, J48 was also slower than TAN, although the processes involved in the latter were heavier and more costly than those of the former. In the authors' view, the underperformance of Logistic, Naïve Bayesian and K2 showed that the high level of noise in the raw data affected modeling and evaluation. K2, the worst performer, obtained very poor results in terms of TPR (31.0%), precision (21.0%), recall (32.0%), F-Measure (32.2%) and accuracy (41.8%). Even though some of the classifiers obtained poor results, the ability of the learners to classify the data showed the reliability of the dummy data for testing credit card fraud detection. To further improve the classification results, in Experiment 2 the raw dummy dataset was passed through the data transformation and data reduction techniques mentioned above.

Table 3
Results of Classification Using Raw Dummy Dataset

Metric                        K2      Naïve Bayesian   TAN     Logistic   J48
True Positive Rate (%)        31.0    50.3             75.0    60.3       73.0
False Positive Rate (%)       69.0    49.7             25.0    39.7       27.0
Precision (%)                 21.0    45.7             73.0    44.7       69.4
Recall (%)                    32.0    60.3             75.0    47.8       67.5
F-Measure (%)                 32.2    34.3             68.5    44.9       67.4
Processing Speed (seconds)    10.0    10.0             56.0    25.0       84.0
Accuracy                      41.8    53.7             84.0    67.3       80.0

B. Results and Analysis for Experiment 2

The second experiment used the data filtered with normalization and Principal Component Analysis. In Experiment 2, all five classifiers showed better results than in Experiment 1. All the classifiers achieved an accuracy of more than 95.0% with shorter processing times than in Experiment 1. The minimal FPR showed that the preprocessing techniques employed in this study had increased the reliability of the data by removing unusable attributes. The results of J48 and Logistic showed that both classifiers gained the most from preprocessing of the dataset. K2 showed a huge classification improvement compared with the previous experiment: its TPR rose from 31.0% to 91.7%, an increase of almost 195.80%, after the data transformation and data reduction process. Furthermore, besides the improvement in TPR, precision, recall, F-Measure and accuracy, the processing speed of all the classifiers also improved significantly compared with the previous experiment. The authors were curious about the attributes removed during data preprocessing, so the cleaned dataset was inspected. During this inspection, the authors noticed that the terminal_id attributes had been reduced significantly. Based on the results shown in Table 4, the hypothesis of Experiment 2 was confirmed: the performance of the classifiers on the preprocessed dataset is better than on the raw dataset.

Table 4
Results of Classification Using Transformed Dataset

Metric                        K2      Naïve Bayesian   TAN     Logistic   J48
True Positive Rate (%)        91.7    99.6             99.7    100.0      100.0
False Positive Rate (%)       8.3     0.4              0.3     0.0        0.0
Precision (%)                 92.6    95.6             98.4    100.0      100.0
Recall (%)                    91.7    99.6             99.6    100.0      100.0
F-Measure (%)                 95.7    89.3             99.0    100.0      100.0
Processing Speed (seconds)    2.0     2.0              30.0    5.0        32.0
Accuracy                      95.8    96.7             99.7    100.0      100.0

C. Discussion and Future Work

The detection of credit card fraud using data mining and machine learning techniques has become one of the reliable approaches to counter this illegal activity. However, gathering real-time credit card fraud data is very hard. Therefore, to mimic the real data, the development of dummy data may assist the detection process. However, the creation and credibility of dummy data must be ascertained prior to conducting the classification processes. Based on the results from Experiment 1, the credibility of the data could be
ensured by noting the ability of WEKA to produce non-zero results. Generally, WEKA would not be able to process the data if it were highly unstructured and would return N/A (Not Applicable) results or errors, or freeze during the modeling process. This did not happen with our dummy dataset. Furthermore, the dummy dataset was developed from attributes commonly used for credit card fraud detection and was created automatically using GNU data generation scripts. As many data mining researchers emphasize, preprocessing the raw dataset is an essential factor in improving classification results. This was confirmed by the differences between the results of Experiment 1 and Experiment 2: data transformation and data reduction significantly improved classification performance in Experiment 2. As mentioned earlier, the strength of Principal Component Analysis, which reduces the dimensionality without losing much of the information in the attributes, was one of the major factors that improved the classification process. We therefore believe that Principal Component Analysis is a better filtering approach to consider for credit card fraud detection. Our classification process also showed that the Bayesian classifiers K2, Naïve Bayesian and TAN, as well as Logistic and J48, were able to classify and predict credit card fraud better when the data was preprocessed using reliable filtering techniques. Moreover, after the dimensionality of the raw data was reduced using Principal Component Analysis, the authors found that the terminal_id attributes were largely reduced. We therefore assumed that terminal_id information contributes less to credit card fraud detection. However, the investigation of credit card hacking based on physical methods (e.g. hardware stressing) has to use the terminal_id attributes as the reference to identify the illegal activity.

In the future, this study will attempt to explore more credit card fraud detection using real-time data. Since the Bayesian Network classifiers showed better results, comparisons with other types of classifiers, such as hyperplane-based ones, may contribute further to the body of knowledge.

VI. CONCLUSION

This paper tested classification metrics using five classifiers: the Bayesian classifiers Naïve Bayes, K2 and TAN, together with Logistic and J48. The evaluations were conducted using two datasets: a dummy dataset that represented the characteristics of credit card data, and a newly transformed dataset produced using data normalization and Principal Component Analysis. Overall, all the classifiers achieved significantly better results after being fed with filtered data.

ACKNOWLEDGMENT

We are grateful to Universiti Sains Malaysia for providing support to this work.

REFERENCES

[1] WorldPay. (2015, Nov). Global payments report preview: your definitive guide to the world of online payments. Retrieved September 28, 2016, from http://offers.worldpayglobal.com/rs/850-JOA-856/images/GlobalPaymentsReportNov2015.pdf.
[2] Federal Trade Commission. (2008). Consumer sentinel network data book for January - December 2008. Retrieved Oct 20, 2016, from https://www.ftc.gov/.
[3] Bhatla, T.P., Prabhu, V., and Dua, A. (2003). Understanding credit card frauds. Cards Business Review #2003-1, Tata Consultancy Services.
[4] The Nilson Report. (2015). Global fraud losses reach $16.31 billion. Edition: July 2015, Issue 1068.
[5] Y. Sahin and E. Duman, “Detecting credit card fraud by decision trees and support vector machines”, Proceedings of the International Multi-Conference of Engineers and Computer Scientists 2011 Vol I, IMECS 2011, March 2011.
[6] Elkan, C. (2001). Magical thinking in data mining: lessons from COIL challenge 2000. Proc. of SIGKDD01, 426-431.
[7] Zaki, M. J., & Meira Jr., W. (2014). Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press. ISBN 978-0-521-76633-3.
[8] F. N. Ogwueleka. (2011). Data mining application in credit card fraud detection system. Journal of Engineering Science and Technology, Vol. 6, No. 3, 311-322.
[9] V. Bhusari & S. Patil. (2011). Application of hidden markov model in credit card fraud detection. International Journal of Distributed and Parallel Systems (IJDPS), Vol. 2, No. 6.
[10] S.J. Stolfo, D.W. Fan, W. Lee, A.L. Prodromidis, and P.K. Chan. (1998). Credit card fraud detection using meta-learning: issues and initial results. Proc. AAAI Workshop AI Methods in Fraud and Risk Management, pp. 83-90.
[11] Sen, Sanjay Kumar, & Dash, Sujatha. (2013). Meta learning algorithms for credit card fraud detection. International Journal of Engineering Research and Development, Volume 6, Issue 6, pp. 16-20.
[12] Maes, Sam, Tuyls, Karl, Vanschoenwinkel, Bram & Manderick, Bernard. (2002). Credit card fraud detection using bayesian and neural networks. Proc. of 1st NAISO Congress on Neuro Fuzzy Technologies. Havana.
[13] A.C. Bahnsen, Aleksandar Stojanovic, D. Aouada & Bjorn Ottersten. (2013). Cost sensitive credit card fraud detection using bayes minimum risk. 12th International Conference on Machine Learning and Applications.
[14] Amlan Kundu, Suvasini Panigrahi, Shamik Sural and Arun K. Majumdar. (2009). Credit card fraud detection: a fusion approach using dempster–shafer theory and bayesian learning. Special Issue on Information Fusion in Computer Security, Vol. 10, Issue No. 4, pp. 354-363.
[15] Lam, Bacchus. (1994). Learning bayesian belief networks: an approach based on the MDL principle. Computational Intelligence, Vol. 10, Issue No. 3, pp. 269-293.
[16] M. Mehdi, S. Zair, A. Anou and M. Bensebti. (2007). A bayesian networks in intrusion detection systems. International Journal of Computational Intelligence Research, Issue No. 1, pp.0973-1873 Vol. 3.
[17] R. Najafi & Afsharchi, Mohsen. (2012). Network intrusion detection using tree augmented naive-bayes. The Third International Conference on Contemporary Issues in Computer and Information Sciences (CICI) 2012.
[18] G. Cooper, E. Herskovits. (1992). A bayesian method for the induction of probabilistic networks from data. Machine Learning, 9(4):309-347.
[19] Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.
[20] Friedman, N. and Goldszmidt, M. (1996). Building classifiers using bayesian networks. Proc. 13th National Conference on Artificial Intelligence, Vol. 2, pp. 1277-1284.
[21] Friedman, N., Geiger, D. and Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, Vol. 29, pp. 131-163. Kluwer Academic Publishers, Boston.
