CREDIT CARD FRAUD DETECTION

A MINI PROJECT REPORT

Submitted in Partial fulfilment of the requirements

For the award of the degree of

MASTER OF SCIENCE

IN

COMPUTER SCIENCE

Submitted by

VAISHNAVI A (22SPCS39)

Under the Guidance of

Dr. T. S. URMILA, MCA, M.Phil., Ph.D.

DEPARTMENT OF COMPUTER SCIENCE

THIAGARAJAR COLLEGE (AUTONOMOUS)


(Affiliated to Madurai Kamaraj University)

Re-Accredited with “A++ Grade” by NAAC

MADURAI-625009

NOVEMBER-2023
THIAGARAJAR COLLEGE (AUTONOMOUS),

MADURAI-625009

(Affiliated to Madurai Kamaraj University)

Re-Accredited with “A++ Grade” by NAAC

DEPARTMENT OF COMPUTER SCIENCE

BONAFIDE CERTIFICATE

This is to certify that the mini project entitled “CREDIT CARD FRAUD DETECTION”
for the course PCS20MP31 - Mini Project & Viva Voce is a bonafide record of the
work done by VAISHNAVI A (22SPCS39), submitted in partial fulfilment for the award
of the Degree of Master of Science in Computer Science, during the period July 2023
to November 2023.

Submitted for the viva-voce held on ___________

Project Guide Head of the Department

External Examiner
VAISHNAVI A(22SPCS39)
M.Sc., Computer Science,
Department of Computer Science,
Thiagarajar College (Autonomous),
Madurai-625009.

DECLARATION

I hereby declare that the mini project work entitled “CREDIT CARD FRAUD DETECTION”
is submitted in partial fulfilment for the award of the degree of Master of Science
in Computer Science. This is the record of original work done by me during the period
of the project work.

Place: Madurai VAISHNAVI A

Date: (22SPCS39)
ACKNOWLEDGEMENT

First and foremost, I thank the Almighty Lord, who has showered his blessings on me
all through my life and given me the wisdom and ability to use knowledge in completing
this mini project.

It is indeed an immense pleasure to thank the people who have contributed towards the
successful completion of this mini project.

I take much pleasure in acknowledging my deep sense of gratitude and indebtedness to
our Principal, Dr. D. PANDIARAJA, M.Sc., M.Phil., PGDCA., B.Ed., Ph.D., for giving me
the opportunity to carry out the mini project work.

I would like to express my profound gratitude for the support offered by the Head of
the Department of Computer Science (Self Finance), Dr. G. RAKESH, M.C.A., M.Phil., Ph.D.
I am very thankful to him for supporting and encouraging me in this mini project work.

I express my hearty thanks to Dr. T. S. URMILA, MCA, M.Phil., Ph.D., for her continuous
and valuable guidance throughout the mini project period.

I thank all the faculty members of the Computer Science Department for their
cooperation. Above all, I am personally grateful to my family and friends for their
continuous encouragement.
CONTENTS

S.NO TITLE

1 INTRODUCTION
1.1 ABSTRACT
1.2 PROJECT DESCRIPTION
1.3 MODULE DESCRIPTION
2 SYSTEM ANALYSIS
2.1 EXISTING SYSTEM
2.2 PROPOSED SYSTEM
3 SYSTEM CONFIGURATION
3.1 HARDWARE REQUIREMENTS
3.2 SOFTWARE REQUIREMENTS
3.3 SOFTWARE SPECIFICATION
4 SYSTEM DESIGN
4.1 INPUT DESIGN
4.2 OUTPUT DESIGN
4.3 LIST OF DIAGRAMS
5 SYSTEM TESTING
6 SYSTEM IMPLEMENTATION
7 SAMPLES
7.1 SAMPLE CODING
7.2 SCREEN SHOTS
8 CONCLUSION
9 FUTURE ENHANCEMENT
10 BIBLIOGRAPHY
10.1 BOOK REFERENCE
10.2 WEB LINKS
1.INTRODUCTION

1.1 ABSTRACT

Credit card transaction fraud costs card issuers billions of dollars every year. A
well-developed fraud detection system with a state-of-the-art fraud detection model is regarded
as essential to reducing fraud losses. The detection of fraudulent transactions has become a
significant factor affecting the wider adoption of electronic payment. The main contribution
of this work is the development of a fraud detection system that employs a machine learning
architecture together with an advanced feature engineering process. To demonstrate the
effectiveness of the proposed system for detecting fraud in credit card transactions,
experiments were performed on a real-world public credit card transaction data set (the Credit
Card Fraud Dataset) consisting of fraudulent and legitimate transactions. Two machine learning
algorithms, Light Gradient Boosting Machine (LightGBM) and Random Forest, are implemented
and compared. The managerial implication of this work is that credit card issuers can apply the
proposed methodology to efficiently identify fraudulent transactions to protect customers’
interests and reduce fraud losses and regulatory costs.

1.2 PROJECT DESCRIPTION

The main contribution of the work is the development of a fraud detection system that
employs a machine learning architecture together with an advanced feature engineering process.
The managerial implication of this work is that credit card issuers can apply the proposed
methodology to efficiently identify fraudulent transactions to protect customers’ interests and
reduce fraud losses and regulatory costs.

1.3 MODULE DESCRIPTION

LIST OF MODULES

• Data preprocessing

• Model Creation

• Performance Evaluation

Data Preprocessing:

• Data preprocessing is the process of cleaning the dataset by removing unwanted data.

• Missing data handling: null values, such as missing values and NaN values, are replaced
by 0.

• Encoding categorical data: categorical data are variables with a finite set of label
values. These labels are encoded as numbers before the model is trained.
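
The two preprocessing steps above can be sketched as follows. This is a minimal illustration on made-up data, not the project's own code: the column names `amount` and `merchant_type`, the helper `preprocess`, and the choice to treat a missing category as its own label are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
raw = pd.DataFrame({
    "amount": [120.5, None, 33.0, 990.0],
    "merchant_type": ["grocery", "travel", None, "grocery"],
    "Class": [0, 0, 0, 1],
})

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Replace missing values and label-encode non-numeric columns."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Missing numeric values are replaced by 0, as described above
            out[col] = out[col].fillna(0)
        else:
            # Assumption: a missing category becomes its own label,
            # then every label is mapped to an integer code
            out[col] = out[col].fillna("missing")
            out[col] = out[col].astype("category").cat.codes
    return out

clean = preprocess(raw)
print(clean)
```

After this step every column is numeric and free of nulls, which is the form the classifiers expect.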

Model Creation:

Data Splitting:

• In this process, the dataset is divided into a training dataset and a test dataset.

• Data splitting is the partitioning of the available data into two portions, usually for
cross-validation purposes.

• One portion of the data is used to develop a predictive model and the other to evaluate the
model’s performance.
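
A small sketch of the split on synthetic data is given below. The 80/20 ratio and `random_state=0` follow the sample coding in section 7.1; the `stratify=y` argument is an addition for illustration only, which keeps the share of fraud cases the same in both portions.

```python
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 transactions, 10 of them fraud (label 1).
X = [[i] for i in range(100)]
y = [1 if i % 10 == 0 else 0 for i in range(100)]

# 80% of the rows train the model, 20% are held back for evaluation.
# stratify=y (an assumption, not in the report's code) preserves the
# fraud ratio in both the training and the test portion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0, stratify=y
)

print("train rows:", len(X_train), "fraud in train:", sum(y_train))
print("test rows:", len(X_test), "fraud in test:", sum(y_test))
```

Stratifying matters most when fraud cases are rare, because a purely random split could leave very few of them in the test portion.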

Classification:

• LightGBM is a gradient boosting framework based on decision trees that increases the
efficiency of the model and reduces memory usage. It uses two novel techniques:
Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB).

• Random forests, or random decision forests, are an ensemble learning method for
classification, regression and other tasks. They operate by constructing a multitude of
decision trees at training time and outputting the class that is the mode of the individual
trees' predictions (classification) or the mean of their predictions (regression).
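
The way a random forest combines its trees, as described above, can be illustrated with a few lines of plain Python. The function names and the tree outputs are hypothetical. Real libraries may combine trees differently (scikit-learn, for instance, averages the trees' predicted class probabilities), so this is only a sketch of the idea.

```python
from collections import Counter
from statistics import mean

def forest_classify(tree_votes):
    """Classification: return the most common class among the trees' votes."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_regress(tree_outputs):
    """Regression: return the average of the trees' numeric outputs."""
    return mean(tree_outputs)

# Hypothetical votes from five trees for one transaction (0 = normal, 1 = fraud)
votes = [0, 1, 1, 0, 1]
print("predicted class:", forest_classify(votes))
print("predicted value:", forest_regress([0.2, 0.4, 0.9]))
```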

Prediction:

• After the classification algorithms are trained, predicted values are obtained for the
testing data.

• In the prediction module, each credit card transaction is predicted as either fraud or
non-fraud.

Performance Evaluation:

• The final result is generated from the overall classification and prediction. The
performance of the proposed approach is evaluated using the following measures:

• Accuracy
• Precision
• Recall
• F1-score

• The result is presented as a graph that compares the two algorithms and shows which
algorithm provides higher accuracy.
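
The four measures listed above can be computed directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function name `evaluate` and the counts passed to it are invented for illustration and are not results from this project.

```python
def evaluate(tn, fp, fn, tp):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Precision: of the transactions flagged as fraud, how many really were fraud
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real fraud cases, how many were flagged
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 1000 transactions, 20 of them fraud;
# the model flags 16 transactions, 12 of which are real fraud.
scores = evaluate(tn=976, fp=4, fn=8, tp=12)
for name, value in scores.items():
    print(name, round(value, 3))
```

With these made-up counts the accuracy is high even though 8 of the 20 fraud cases are missed, which is why precision, recall and F1-score are reported alongside accuracy when fraud cases are rare.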

2.SYSTEM ANALYSIS

2.1 EXISTING SYSTEM

• The large-scale use of credit cards and the lack of effective security systems result in
billions of dollars in losses to credit card fraud.
• In the existing system, machine learning methods including Support Vector Machine (SVM)
and Decision Tree are used.
• The existing system does not effectively classify and predict fraud in credit card
transactions.

DISADVANTAGES:

• It is not efficient for large volumes of data.

• Time consumption is high.

• Prediction accuracy is low.

• Performance is poor on noisy data.

2.2 PROPOSED SYSTEM

• This project proposes an approach for detecting fraudulent credit card transactions that uses
machine learning algorithms, namely Light Gradient Boosting Machine (LightGBM) and Random
Forest.

• The proposed method can identify relatively more fraudulent transactions than the existing
methods under an acceptable false positive rate.

• The managerial implication of the work is that credit card issuers can apply the methodology
to efficiently identify fraudulent transactions to protect customers’ interests and reduce
fraud losses and regulatory costs.

ADVANTAGES:

• Faster training speed.
• Higher efficiency.
• Low time consumption.
• High prediction accuracy.
• Lower memory usage.
• Reduced overfitting.
• Handles large datasets efficiently.

3.SYSTEM CONFIGURATION

3.1 HARDWARE REQUIREMENTS

• System : Pentium IV 2.4 GHz
• Hard Disk : 200 GB
• RAM : 4 GB

3.2 SOFTWARE REQUIREMENTS

• Operating system - Windows 11
• Language - Python
• IDE - Anaconda Navigator - Spyder

3.3 SOFTWARE SPECIFICATION

A Software Requirements Specification (SRS) is a document that describes what the
software will do and how it is expected to perform. It also describes the functionality
the product needs in order to fulfil the needs of all stakeholders (business, users).

Python:

Python is one of those rare languages which can claim to be both simple and powerful.
You will find yourself pleasantly surprised to see how easy it is to concentrate on the solution to
the problem rather than the syntax and structure of the language you are programming in. The
official introduction to Python is: “Python is an easy to learn, powerful programming language. It
has efficient high-level data structures and a simple but effective approach to object-oriented
programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature,
make it an ideal language for scripting and rapid application development in many areas on most
platforms.” Most of these features are discussed in more detail in the next section.

Features of Python

• Simple

Python is a simple and minimalistic language. Reading a good Python program feels
almost like reading English, although very strict English! This pseudo-code nature of Python is
one of its greatest strengths. It allows you to concentrate on the solution to the problem rather than
the language itself.

• Easy to Learn

As you will see, Python is extremely easy to get started with. Python has an extraordinarily
simple syntax, as already mentioned.

• Free and Open Source

Python is an example of FLOSS (Free/Libre and Open Source Software). In simple
terms, you can freely distribute copies of this software, read its source code, make changes to it,
and use pieces of it in new free programs. FLOSS is based on the concept of a community which
shares knowledge. This is one of the reasons why Python is so good: it has been created and is
constantly improved by a community who just want to see a better Python.

• High-level Language

When you write programs in Python, you never need to bother about low-level details such as
managing the memory used by your program.

• Portable

Due to its open-source nature, Python has been ported to (i.e. changed to make it work on) many
platforms. All your Python programs can work on any of these platforms without requiring any changes
at all if you are careful enough to avoid any system-dependent features.

You can use Python on GNU/Linux, Windows, FreeBSD, Macintosh, Solaris, OS/2, Amiga,
AROS, AS/400, BeOS, OS/390, z/OS, Palm OS, QNX, VMS, Psion, Acorn RISC OS, VxWorks,
PlayStation, Sharp Zaurus, Windows CE and Pocket PC!

You can even use a platform like Kivy to create games for your computer and for iPhone, iPad,
and Android.

• Interpreted

This requires a bit of explanation.

A program written in a compiled language like C or C++ is converted from the source language
(C or C++) into the language spoken by your computer (binary code, i.e. 0s and 1s) using a compiler
with various flags and options. When you run the program, the linker/loader software copies the program
from hard disk to memory and starts running it.

Python, on the other hand, does not need a separate compilation step to binary. You just run the
program directly from the source code. Internally, Python converts the source code into an
intermediate form called bytecode, which the Python interpreter then executes. All this makes using
Python much easier, since you don’t have to worry about compiling the program, making sure that the
proper libraries are linked and loaded, etc. This also makes your Python programs much more portable,
since you can just copy your Python program onto another computer and it just works!

• Object Oriented

Python supports procedure-oriented programming as well as object-oriented programming. In
procedure-oriented languages, the program is built around procedures or functions, which are nothing but
reusable pieces of programs. In object-oriented languages, the program is built around objects, which
combine data and functionality. Python has a very powerful but simplistic way of doing OOP, especially
when compared to big languages like C++ or Java.

• Extensible

If you need a critical piece of code to run very fast, or want to keep some piece of an algorithm
closed, you can code that part of your program in C or C++ and then use it from your Python program.

• Embeddable

You can embed Python within your C/C++ programs to give scripting capabilities to your
program’s users.

• Extensive Libraries

The Python Standard Library is huge indeed. It can help you do various things involving regular
expressions, documentation generation, unit testing, threading, databases, web browsers, CGI, FTP,
email, XML, XML-RPC, HTML, WAV files, cryptography, GUI (graphical user interfaces), and other
system-dependent stuff. Remember, all this is always available wherever Python is installed. This is
called the Batteries Included philosophy of Python.
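
The bytecode step mentioned under "Interpreted" can be observed with the standard library's `dis` module. The function `is_fraud` below is a made-up example used only to have something to inspect; the exact instructions printed depend on the Python version.

```python
import dis

def is_fraud(label):
    """Return True when a class label marks a fraudulent transaction."""
    return label == 1

# The compiled form of the function is stored on its code object as raw bytes.
bytecode = is_fraud.__code__.co_code
print(type(bytecode), len(bytecode))

# dis prints a human-readable listing of the same bytecode instructions.
dis.dis(is_fraud)
```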

4.SYSTEM DESIGN

4.1 INPUT DESIGN

The input design is the link between the information system and the user. It comprises
developing the specifications and procedures for data preparation, and the steps necessary
to put transaction data into a usable form for processing. This can be achieved by instructing the
computer to read data from a written or printed document, or by having people key
the data directly into the system. The design of input focuses on controlling the amount of input
required, controlling errors, avoiding delay, avoiding extra steps and keeping the process
simple. The input is designed so that it provides security and ease of use while
retaining privacy. Input design considers the following:

• What data should be given as input?
• How should the data be arranged or coded?
• The dialog to guide the operating personnel in providing input.
• Methods for preparing input validations and the steps to follow when errors occur.

4.2 OUTPUT DESIGN

A quality output is one which meets the requirements of the end user and presents
the information clearly. In any system, the results of processing are communicated to the users and
to other systems through outputs. In output design, it is determined how the information is to be
displayed for immediate need and also as hard copy output. It is the most important and direct
source of information for the user. Efficient and intelligent output design improves the system’s
relationship with the user and helps decision-making.

Designing computer output should proceed in an organized, well-thought-out manner; the right
output must be developed while ensuring that each output element is designed so that people
find the system easy and effective to use. When analysts design computer output, they should:

1. Identify the specific output that is needed to meet the requirements.
2. Select methods for presenting information.
3. Create documents, reports, or other formats that contain information produced by the system.

The output form of an information system should accomplish one or more of the
following objectives:

a. Convey information about past activities, current status or projections of the future.
b. Signal important events, opportunities, problems, or warnings.
c. Trigger an action.
d. Confirm an action.

4.3 LIST OF DIAGRAMS

DATA FLOW DIAGRAM

A data flow diagram shows the way information flows through a process or system. The data
flow diagram of the proposed system is given here.

[Figure: Data flow diagram. Credit Card Fraud Dataset -> Input data -> Data Preprocessing
(handling missing values, label encoding, dropping unwanted columns) -> Data Splitting
(training data, testing data) -> Classification (Light GBM, Random Forest) -> Prediction ->
Performance evaluation]

5.SYSTEM TESTING

System testing is the stage of implementation which is aimed at ensuring that the system
works accurately and efficiently before live operation commences. Testing is the
process of executing a program with the intent of finding an error. A good test case is one
that has a high probability of finding an error. A successful test is one that uncovers an
as-yet-undiscovered error.

Testing is vital to the success of the system. System testing makes the logical
assumption that if all parts of the system are correct, the goal will be successfully achieved.
A series of tests is performed before the system is ready for user acceptance testing.
Any engineered product can be tested in one of the following ways. Knowing the specified
function that a product has been designed to perform, tests can be conducted to demonstrate
that each function is fully operational. Knowing the internal workings of a product, tests can be
conducted to ensure that “all gears mesh”, that is, that the internal operation of the product
performs according to the specification and all internal components have been adequately
exercised.

UNIT TESTING:

Unit testing is the testing of each module before the overall system is integrated.
It focuses verification effort on the smallest unit of software design, the module.
This is also known as ‘module testing’.

The modules of the system are tested separately. This testing is carried out during
programming itself. In this testing step, each module is found to be working
satisfactorily with regard to the expected output from the module. There are some validation
checks for the fields. For example, a validation check is done to verify the data
given by the user, covering both the format and the validity of the data entered. This makes
it very easy to find errors and debug the system.
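
A validation check of the kind described above, together with its unit tests, might look like the following sketch. The function `validate_amount` and the rule that a transaction amount must be a finite, non-negative number are assumptions made for illustration; the report does not list its actual checks.

```python
import math

def validate_amount(text):
    """Check format (parses as a number) and validity (finite, not negative)."""
    try:
        amount = float(text)
    except (TypeError, ValueError):
        return False  # wrong format: not a number at all
    # float() also accepts "nan" and "inf", which are not valid amounts
    return math.isfinite(amount) and amount >= 0

# Unit tests: each assertion checks one expectation of this single module
assert validate_amount("149.62")
assert validate_amount("0")
assert not validate_amount("abc")
assert not validate_amount("-5")
assert not validate_amount("nan")
assert not validate_amount(None)
print("all validation tests passed")
```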

INTEGRATION TESTING:

Data can be lost across an interface; one module can have an adverse effect on
another; and sub-functions, when combined, may not produce the desired major function.
Integration testing is systematic testing that can be done with sample data. The purpose of
the integration test is to assess the overall system performance. There are two types of
integration testing:

• Top-down integration testing.
• Bottom-up integration testing.

WHITE BOX TESTING:

White box testing is a test case design method that uses the control structure of the
procedural design to derive test cases. Using white box testing methods, test cases were
derived that guarantee that all independent paths within a module have been exercised at
least once.
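
As a small illustration of exercising every independent path, the sketch below wraps the NORMAL/FRAUD decision from the sample coding in a function and gives one test case per branch. The function name `label_prediction` is hypothetical and is not part of the project code.

```python
def label_prediction(pred):
    """Map a predicted class (0 or 1) to the message printed in the output."""
    if pred == 0:
        return "The Credit Card = NORMAL"
    else:
        return "The Credit Card = FRAUD"

# One test case per independent path through the function
assert label_prediction(0) == "The Credit Card = NORMAL"   # 'if' branch
assert label_prediction(1) == "The Credit Card = FRAUD"    # 'else' branch
print("both paths exercised")
```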

BLACK BOX TESTING:

Black box testing is done to find:

• Incorrect or missing functions
• Interface errors
• Errors in external database access
• Performance errors
• Initialization and termination errors

Functional testing is performed to validate that an application conforms to its
specifications and correctly performs all its required functions, so this testing is also
called ‘black box testing’. It tests the external behaviour of the system. Here,
knowing the specified function that the product has been designed to perform, tests can
be conducted to demonstrate that each function is fully operational.

VALIDATION TESTING:

After the culmination of black box testing, the software is completely assembled as a
package, interfacing errors have been uncovered and corrected, and a final series of
software validation tests begins. Validation testing can be defined in many ways,
but a simple definition is that validation succeeds when the software functions in a manner
that can be reasonably expected by the customer.

OUTPUT TESTING:

After performing validation testing, the next step is output testing of the proposed
system, since no system is useful if it does not produce the required output in the
specified format. Users are asked about the format they require for the output
displayed or generated by the system under consideration. Here the output format is
considered in two ways: one is on screen and the other is printed format. The output
format on the screen is found to be correct, as the format was designed in the system
design phase according to the user's needs. The hard copy output also meets the
requirements specified by the user. Hence output testing does not result in any
correction to the system.

6.SYSTEM IMPLEMENTATION

Implementation is the stage in the project where the theoretical design is
turned into a working system. It is the most critical stage in achieving a successful system
and in giving the users confidence that the new system will work efficiently
and effectively. It involves careful planning, investigation of the current system and its
constraints on implementation, and design of methods to achieve the changeover.

The implementation process begins with preparing a plan for the
implementation of the system. The activities are carried out according to this plan,
which covers the equipment, the resources and how the activities are to be tested.

The coding step translates a detailed design representation into a programming
language realization. Programming languages are vehicles for communication
between humans and computers. Programming language characteristics and coding
style can profoundly affect software quality and maintainability. The coding is done
with the following characteristics in mind:

• Ease of design-to-code translation.
• Code efficiency.
• Memory efficiency.
• Maintainability.

The developer should be very careful while implementing a project to ensure that what
was planned is properly implemented. The purpose of the project should not change
during implementation, and the solution should not be reached in a roundabout way;
it should be direct, crisp, clear and to the point.

Implementation is the stage of the project when the theoretical design is
turned into a working system. Thus it can be considered the most critical stage
in achieving a successful new system and in giving the user confidence that the new
system will work and be effective.

The implementation stage involves careful planning, investigation of the
existing system and its constraints on implementation, designing of methods to achieve
changeover and evaluation of changeover methods.

7.SAMPLES

7.1 SAMPLE CODING

Data Preprocessing

# Import the libraries
import warnings

import matplotlib.pyplot as plt
import pandas as pd
import scikitplot as skplt
import seaborn as sns
from lightgbm import LGBMClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

warnings.filterwarnings("ignore")

# Read the input data
data = pd.read_csv('creditcard_1.csv')
print(data.head())
data.info()
# Preprocessing: check for missing values in the data
print()
print("Checking Missing Values")
print(data.isnull().sum())
print(data.describe())

# Data visualization: number of fraud and valid transactions
count_classes = data['Class'].value_counts(sort=True)
count_classes.plot(kind='bar', rot=0)
plt.title("Transaction Distribution")
plt.xlabel("Class")
plt.ylabel("Frequency")

# Separate the transaction classes: 0 = NORMAL, 1 = FRAUD
Normal = data[data['Class'] == 0]
Fraud = data[data['Class'] == 1]
print()
# Ratio of fraud cases to normal cases
print("Outlier Fraction:", len(Fraud) / float(len(Normal)))
print()
print("Fraud Cases : {}".format(len(Fraud)))
print("Valid Cases : {}".format(len(Normal)))

# Split into train and test data
# Features are all columns except the last; the target is the last column (Class)
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

LGBM Classifier

# Train the LightGBM classifier and predict on the test data
lgbm = LGBMClassifier()
lgbm.fit(X_train, y_train)
lgbm_pred = lgbm.predict(X_test)

print('\n')
print("------Accuracy------")
lgbm_ = accuracy_score(y_test, lgbm_pred) * 100
LGBM = ('Light Gradient Boosting Accuracy is:', lgbm_, '%')
print(LGBM)

print('\n')
print("------Classification Report------")
# classification_report expects the true labels first, then the predictions
print(classification_report(y_test, lgbm_pred))

print('\n')
print('Confusion_matrix')
lgbm_cm = confusion_matrix(y_test, lgbm_pred)
print(lgbm_cm)
print('\n')

# Rows of the confusion matrix are true classes, columns are predicted classes
tn = lgbm_cm[0][0]
fp = lgbm_cm[0][1]
fn = lgbm_cm[1][0]
tp = lgbm_cm[1][1]
Total_TN_FP = tn + fp  # all actual NORMAL transactions
Total_FN_TP = fn + tp  # all actual FRAUD transactions
specificity = tn / (tn + fp)
lgbm_specificity = format(specificity, '.3f')
print('LGBM_specificity:', lgbm_specificity)
print()

# Learning curve with 7-fold cross-validation
plt.figure()
skplt.estimators.plot_learning_curve(LGBMClassifier(), X_train, y_train, cv=7,
    shuffle=True, scoring="accuracy", n_jobs=1, figsize=(6, 4),
    title_fontsize="large", text_fontsize="large",
    title="Light Gradient Boosting Classification Learning Curve")

# Confusion matrix heatmap
plt.figure()
sns.heatmap(confusion_matrix(y_test, lgbm_pred), annot=True)
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()

Random Forest Classifier

# Train the Random Forest classifier and predict on the test data
rf_clf = RandomForestClassifier(n_estimators=10)
rf_clf.fit(X_train, y_train)
rf_ypred = rf_clf.predict(X_test)

print('\n')
print("------Accuracy------")
rf = accuracy_score(y_test, rf_ypred) * 100
RF = ('RANDOM FOREST Accuracy:', rf, '%')
print(RF)

print('\n')
print("------Classification Report------")
# classification_report expects the true labels first, then the predictions
print(classification_report(y_test, rf_ypred))

print('\n')
print('Confusion_matrix')
rf_cm = confusion_matrix(y_test, rf_ypred)
print(rf_cm)
print('\n')

# Rows of the confusion matrix are true classes, columns are predicted classes
tn = rf_cm[0][0]
fp = rf_cm[0][1]
fn = rf_cm[1][0]
tp = rf_cm[1][1]
Total_TN_FP = tn + fp  # all actual NORMAL transactions
Total_FN_TP = fn + tp  # all actual FRAUD transactions
specificity = tn / (tn + fp)
rf_specificity = format(specificity, '.3f')
sensitivity = tp / (fn + tp)
rf_sensitivity = format(sensitivity, '.3f')
print('RF_specificity:', rf_specificity)
print('RF_sensitivity:', rf_sensitivity)
print('\n')

# Learning curve with 7-fold cross-validation
plt.figure()
skplt.estimators.plot_learning_curve(RandomForestClassifier(n_estimators=10),
    X_train, y_train, cv=7, shuffle=True, scoring="accuracy", n_jobs=-1,
    figsize=(6, 4), title_fontsize="large", text_fontsize="large",
    title="Random Forest Classification Learning Curve")

# Confusion matrix heatmap
plt.figure()
sns.heatmap(confusion_matrix(y_test, rf_ypred), annot=True)
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()

# Comparison of the accuracy of the two classifiers
vals = [lgbm_, rf]
inds = range(len(vals))
labels = ["LGBM", "RF"]
fig, ax = plt.subplots()
rects = ax.bar(inds, vals)
ax.set_xticks(list(inds))
ax.set_xticklabels(labels)
plt.show()

# Show the predicted class for the first 10 test transactions
for i in range(0, 10):
    if rf_ypred[i] == 0:
        print([i], "The Credit Card = NORMAL")
    else:
        print([i], "The Credit Card = FRAUD")
7.2 SCREEN SHOTS

DATA PREPROCESSING:

The figure below shows the dataset being checked for null values.

The figure below shows the dataset being checked for missing values.

DATA SPLITTING:

This figure shows the dataset being split into a train set and a test set.

IMPLEMENTING ALGORITHMS:

The figures below show the implementation of the LGBM and Random Forest algorithms.

LGBM:

Confusion Matrix:

Comparison Graph:

The graph below shows the number of fraud and normal transactions.

Learning Curves for LGBM:

Random Forest:

Confusion Matrix:

Learning Curves for Random Forest:

COMPARING THE ALGORITHMS:

This figure compares both algorithms in a bar chart and shows whether the
credit cards are normal or fraud.

8.CONCLUSION

The detection of credit card fraud is significant to the improved utilization
of credit cards. With large and continuing financial losses being experienced by
financial firms, and given the increasing difficulty of detecting credit card fraud, it is
important to develop more effective approaches for detecting fraudulent credit card
transactions. The results showed that the classifier was able to predict fraudulent
transactions with an accuracy of 98%. The results were evaluated using different metrics
such as recall, precision, F1 score and learning curves, to detect both the fraud and
non-fraud classes with a high accuracy level.

9.FUTURE ENHANCEMENT

In future, it is possible to provide extensions or modifications to the proposed
optimization and classification algorithms using intelligent agents to achieve further
increased performance. Further research can be carried out in two directions. The first
focuses on the computational demand of a real-time fraud detection
system. The second explores the application of more advanced machine learning
methods, and possible combinations of deep learning methods and traditional data
mining methods, in fraud detection.

10.BIBLIOGRAPHY

10.1 BOOK REFERENCE

1. X. Zhang, Y. Han, W. Xu, and Q. Wang, ‘‘HOBA: A novel feature engineering
methodology for credit card fraud detection with a deep learning architecture,’’ Inf.
Sci., May 2019. Accessed: Jan. 8, 2019.
2. N. Carneiro, G. Figueira, and M. Costa, ‘‘A data mining based system for credit-card
fraud detection in e-tail,’’ Decis. Support Syst., vol. 95, pp. 91–101, Mar. 2017.
3. B. Lebichot, Y.-A. Le Borgne, L. He-Guelton, F. Oblé, and G. Bontempi, ‘‘Deep-
learning domain adaptation techniques for credit cards fraud detection,’’ in Proc. INNS
Big Data Deep Learn. Conference, Genoa, Italy, 2019, pp. 78–88.
4. H. John and S. Naaz, ‘‘Credit card fraud detection using local outlier factor and isolation
forest,’’ Int. J. Comput. Sci. Eng., vol. 7, no. 4, pp. 1060–1064, Sep. 2019.
5. V. Zaslavsky and A. Strizhak, ‘‘Credit card fraud detection using self organizing
maps,’’ Inf. Secur., vol. 18, p. 48, Jan. 2006.
6. U. Fiore, A. D. Santis, F. Perla, P. Zanetti, and F. Palmieri, ‘‘Using generative
adversarial networks for improving classification effectiveness in credit card fraud
detection,’’ Inf. Sci., vol. 479, pp. 448–455, Apr. 2019.

10.2 WEB LINKS

1. https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

2. http://www.w3schools.blog/detection-and-prevention-of-fraud

3. https://github.com/topics/credit-card-fraud-detection
