
Major Project

This document outlines a project focused on developing an accurate and interpretable credit risk assessment model using machine learning and deep learning techniques. The model aims to classify loan applicants into four risk categories, leveraging both internal bank data and external credit data while addressing challenges such as class imbalance and model transparency. The project emphasizes the importance of improving prediction accuracy to minimize financial losses for lending institutions.


An Approach to Credit Risk Assessment through Multimodal Data Fusion

Yash Khandagale, Manish Kapal, Arya Shinde, Sanika Wani

BE Sem VII, 2024-25

Guide
Prof. Avinash Gondal

Department Of Computer Engineering

Watumull Institute Of Electronics And Technology
Contents

01 Introduction
02 Motivation
03 Literature Review
04 Objective
05 Problem Statement
06 Proposed Architecture
07 Dataset and Technology
08 Proposed Methodology
09 Gantt Chart
10 Conclusion
11 References

Introduction

• Credit risk modeling is a critical function for financial institutions, helping them assess the
likelihood of loan applicants defaulting.
• The ability to predict an applicant’s risk accurately helps banks and other lending
institutions minimize non-performing loans (NPLs) and optimize loan approval decisions.
• This project focuses on classifying loan applicants into risk categories (P1-P4) using
advanced machine learning (ML) and deep learning (DL) techniques, based on internal bank
data and external credit data (e.g., CIBIL).

Motivation

The financial industry faces significant risks due to loan defaults. Misclassification of
credit risk can lead to substantial financial losses, as high-risk customers may be
approved for loans while low-risk customers are denied. Traditional credit scoring models
often struggle with complex, non-linear data and do not account for dynamic changes in
borrower behavior. By applying machine learning and deep learning models, we can
improve the accuracy and robustness of credit risk predictions, helping banks make better
decisions, minimize defaults, and optimize their lending processes.

Literature Review

• Credit risk prediction has evolved from traditional models like Logistic
Regression to more advanced methods such as XGBoost and Random Forest,
which offer better accuracy but still face challenges with class imbalance and
transparency.
• Techniques like SMOTE improve class balance, and tools like SHAP values
enhance model interpretability. While deep learning shows potential, its
complexity limits practical use.
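
To make the SMOTE idea concrete, here is a minimal sketch using only the Python standard library. It is a simplified illustration rather than the reference algorithm or the imbalanced-learn implementation: the function name `smote_sketch`, the two made-up features, and the sample rows are all assumptions for demonstration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified illustration of the SMOTE idea: pick a minority sample, pick one
    of its k nearest minority neighbours, and place a new point at a random
    position on the line segment between the two.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class (P4) rows: [scaled loan amount, scaled credit score]
p4_rows = [[0.80, 0.20], [0.90, 0.10], [0.85, 0.30], [0.70, 0.25]]
new_rows = smote_sketch(p4_rows, n_new=6)
print(len(new_rows), "synthetic rows created")
```

Because each synthetic row lies between two real minority rows, the new points stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation data remains untouched.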

1. Title: Credit Risk Prediction using Genetic Algorithm and Stacking Ensemble
   Abstract: This paper presents a model that combines Genetic Algorithms for feature selection with stacking ensemble methods, achieving high accuracy for credit risk prediction by optimizing feature selection and model integration.
   Limitations: Computational complexity and scalability issues when applied to larger datasets.
   Future Scope: Enhance scalability and optimize Genetic Algorithm efficiency for larger datasets and real-time applications.

2. Title: Explainable AI Using H2O AutoML and Robustness Check in Credit Risk Management
   Abstract: This paper focuses on applying H2O’s AutoML for model tuning and uses SHAP values for interpretability. It emphasizes the need for transparency in credit risk models to ensure regulatory compliance while maintaining high accuracy.
   Limitations: Lacks detailed performance metrics and does not fully address class imbalance.
   Future Scope: Incorporate class imbalance techniques and provide a more comprehensive evaluation with detailed metrics.

3. Title: Customer Credit Risk: Application and Evaluation of Machine Learning and Deep Learning Models
   Abstract: A comparative analysis of machine learning (SVM, Random Forest) and deep learning models (DNN, CNN) for credit risk prediction. SVM and XGBoost are identified as the top-performing algorithms in terms of accuracy.
   Limitations: SVM struggles with large datasets, and complex models may suffer from overfitting without proper tuning.
   Future Scope: Explore hybrid models combining ML and DL techniques to improve performance on large and imbalanced datasets.

4. Title: Optimizing Loan Approval Decisions
   Abstract: This paper uses ensemble methods to optimize loan approval decisions, providing insights on different machine learning algorithms and their performance in predicting credit risk categories.
   Limitations: Challenges in deploying stacking ensembles in real-time systems, with limited exploration of deep learning models.
   Future Scope: Test deep learning methods and improve the deployment of ensemble models in real-time credit risk environments.

5. Title: Financial Risk Prediction Using Machine Learning
   Abstract: Provides an extensive comparison of various machine learning algorithms (Logistic Regression, Decision Trees, XGBoost) for financial risk prediction. It focuses on practical implementation in the financial industry.
   Limitations: Insufficient handling of class imbalance and lack of model interpretability.
   Future Scope: Integrate interpretability tools (like SHAP) and better address class imbalance through resampling techniques.

6. Title: Novel Approach to Optimize Credit Risk Prediction
   Abstract: Proposes an enhanced SVM model for credit risk prediction, achieving high accuracy (96.92%) and targeted improvements to the algorithm.
   Limitations: Limited dataset size raises concerns about generalizability, and struggles with imbalanced data were noted.
   Future Scope: Apply the model to larger datasets and enhance its handling of class imbalance using techniques like SMOTE.

7. Title: An Automatic Deep Reinforcement Learning Based Credit Scoring Model using Deep-Q
   Abstract: This paper introduces a Deep-Q Network (reinforcement learning) approach for credit scoring. It focuses on adaptive learning and real-time feedback for improved credit risk classification.
   Limitations: Low accuracy (~10%) due to class imbalance and complex tuning requirements, limiting practical application.
   Future Scope: Improve accuracy and scalability by tuning the Deep-Q network and incorporating data balancing techniques.

8. Title: Credit Risk Assessment
   Abstract: Explores the use of HistGradientBoosting for credit risk assessment, achieving high accuracy (89.08%) and precision (94.54%) while efficiently handling large datasets.
   Limitations: Needs further validation in real-world scenarios, with a risk of overfitting if not properly regularized.
   Future Scope: Test model robustness in real-world environments and apply stronger regularization to avoid overfitting.

Objective

The objective of this project is to develop an accurate, scalable, and interpretable credit
risk prediction model that categorizes loan applicants into four risk categories: P1 (lowest
risk) to P4 (highest risk).
The model will leverage machine learning and deep learning techniques, combining both
internal bank data (loan accounts, payment histories) and external credit data (CIBIL
scores).
The primary goals are:
• To minimize financial losses by accurately predicting high-risk applicants.
• To ensure model interpretability for regulatory compliance.
• To balance model performance and computation time, making it scalable for real-world
deployment.

Problem Statement
Financial institutions face challenges in predicting the risk of loan default accurately, as
traditional models struggle to handle large volumes of data, class imbalance, and complex
relationships between borrower characteristics.
Inaccurate credit risk classification can lead to either loan defaults or missed lending
opportunities.
Therefore, the problem is to build a robust machine learning model that:
1. Predicts the likelihood of default and classifies applicants into risk categories.
2. Handles imbalanced datasets, ensuring minority high-risk categories are accurately
identified.
3. Provides model transparency and interpretability for regulatory compliance.
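
The second requirement can be illustrated with a small, self-contained sketch showing why overall accuracy hides poor performance on minority classes. The class counts and the "always predict P1" model below are hypothetical and chosen only to make the effect visible; `per_class_recall` is an illustrative helper, not part of any library.

```python
from collections import Counter

LABELS = ["P1", "P2", "P3", "P4"]


def per_class_recall(y_true, y_pred):
    """Recall per label: correctly predicted rows / actual rows of that label."""
    actual = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / actual[label] for label in LABELS if actual[label]}


# Hypothetical, deliberately imbalanced test set: 70 P1, 20 P2, 8 P3, 2 P4.
y_true = ["P1"] * 70 + ["P2"] * 20 + ["P3"] * 8 + ["P4"] * 2
# A naive model that assigns every applicant to the majority class.
y_pred = ["P1"] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
macro_recall = sum(recall.values()) / len(recall)

print(f"accuracy     = {accuracy:.2f}")
print(f"recall       = {recall}")
print(f"macro recall = {macro_recall:.2f}")
```

In this toy case the naive model is right for 70 of 100 applicants yet never identifies a single P2, P3, or P4 applicant, which is why per-class and macro-averaged metrics are worth reporting alongside accuracy.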

Proposed Architecture
The architecture for this project consists of several key stages:
1. Data Collection: Collect internal bank data (loan, account history,
etc.) and external data (CIBIL).
2. Data Preprocessing:
• Handle missing values.
• Encode categorical features and scale numerical ones.
• Reduce dimensionality through feature selection with Genetic Algorithms or
feature extraction with Principal Component Analysis (PCA).
3. Data Balancing: Address class imbalance using SMOTE.
4. Model Training:
• Train multiple models: Decision Tree, Random Forest, and
XGBoost.
• Perform cross-validation and hyperparameter tuning.
5. Evaluation and Explainability:
• Evaluate the model using metrics like Accuracy, Precision,
Recall, and AUC-ROC.
• Implement explainability using SHAP values.
6. Deployment: Deploy the best-performing model for real-time loan
approval predictions.
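
The Genetic Algorithm feature selection mentioned in step 2 can be sketched roughly as follows, using only the Python standard library. Everything specific here is an assumption made for illustration: the toy rows and labels, the nearest-centroid fitness function, the `penalty` per selected feature, and the names `centroid_accuracy` and `ga_select`. A real project would more likely score each feature subset with cross-validated performance of the actual model.

```python
import random


def centroid_accuracy(rows, labels, mask):
    """Training accuracy of a nearest-centroid classifier on the selected columns."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append([row[c] for c in cols])
    centroids = {
        label: [sum(col) / len(pts) for col in zip(*pts)]
        for label, pts in groups.items()
    }
    hits = 0
    for row, label in zip(rows, labels):
        point = [row[c] for c in cols]
        nearest = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(point, centroids[lab])),
        )
        hits += nearest == label
    return hits / len(rows)


def ga_select(rows, labels, generations=20, pop_size=12, penalty=0.02, seed=0):
    """Evolve 0/1 feature masks; fitness rewards accuracy and penalises mask size."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        return centroid_accuracy(rows, labels, mask) - penalty * sum(mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)            # mutate a single bit
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Hypothetical data: column 0 separates the classes, columns 1-3 are noise.
rows = [
    [0.10, 0.5, 0.9, 0.30],
    [0.20, 0.9, 0.1, 0.70],
    [0.15, 0.2, 0.5, 0.40],
    [0.90, 0.6, 0.8, 0.35],
    [0.80, 0.8, 0.2, 0.60],
    [0.85, 0.1, 0.6, 0.50],
]
labels = ["P1", "P1", "P1", "P4", "P4", "P4"]
best_mask = ga_select(rows, labels)
print("selected feature mask:", best_mask)
```

Each candidate solution is a bit mask over the feature columns. The size penalty pushes the search towards smaller subsets when accuracy is otherwise equal, which is the dimensionality-reduction effect the architecture relies on.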

Dataset and Technology

Technology:
• Programming Language: Python
• Libraries/Frameworks:
    • Scikit-learn: For machine learning algorithms.
    • XGBoost: For implementing gradient boosting models.
    • Pandas & NumPy: For data preprocessing and manipulation.
    • H2O AutoML: For hyperparameter tuning and model selection.
    • SHAP: For model interpretability.
• Hardware:
    • Intel i7 processor (or equivalent).
    • 16GB RAM.
    • Optional: GPU (for faster training of deep learning models).
Proposed Methodology
1. Data Preprocessing:
• Handle missing values through imputation.
• Encode categorical variables using One-Hot Encoding.
• Scale numerical features to ensure uniformity across different ranges (e.g., loan
amounts, credit scores).
2. Feature Selection:
• Use Genetic Algorithms (GA) to reduce dimensionality and identify key predictive
features.
3. Model Selection:
• Train multiple machine learning models:
    • Decision Tree: For simplicity and interpretability.
    • Random Forest: For improved accuracy and robustness.
    • XGBoost: For high accuracy and efficiency in handling imbalanced data.
4. Evaluation:
• Use metrics such as Accuracy, Precision, Recall, F1-Score, and AUC-ROC to evaluate
model performance.
• Perform cross-validation to ensure the model generalizes well to unseen data.
5. Explainability:
• Implement SHAP values to provide interpretability for model predictions, ensuring
compliance with financial regulations.
6. Deployment:
• Deploy the best-performing model to the bank’s loan approval system for real-time
predictions.
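
SHAP values are based on Shapley values from cooperative game theory. For a handful of features they can be computed exactly by averaging each feature's marginal contribution over every possible ordering, which the sketch below does with the standard library alone. The scoring function `risk_score`, its three features, the coefficients, and the baseline applicant are all hypothetical. The `shap` library itself relies on faster model-specific or sampling-based algorithms instead of this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features that have not yet been "revealed" keep their baseline value; each
    feature's attribution is its average marginal contribution over all orderings.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            value = predict(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


# Hypothetical risk score over three made-up features (higher means riskier).
def risk_score(features):
    missed_payments, utilisation, credit_age_years = features
    return (
        0.3 * missed_payments
        + 0.5 * utilisation
        - 0.02 * credit_age_years
        + 0.1 * missed_payments * utilisation  # interaction term
    )


applicant = [3, 0.9, 2]   # illustrative applicant
baseline = [0, 0.3, 8]    # illustrative "typical" applicant
phi = shapley_values(risk_score, applicant, baseline)

for name, value in zip(["missed_payments", "utilisation", "credit_age_years"], phi):
    print(f"{name:>18}: {value:+.2f}")
```

The attributions sum to the difference between the applicant's score and the baseline score, so each value can be read as that feature's share of the gap. This additive property is what makes such explanations useful when a lending decision has to be justified.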

Gantt Chart

Conclusion

• This project presents a comprehensive approach to credit risk modeling, leveraging both
internal and external datasets. By applying machine learning algorithms like XGBoost and
ensemble methods like Random Forest, the project aims to provide a robust solution for
classifying loan applicants into different risk categories.
• The model will not only improve prediction accuracy but will also provide interpretability
using SHAP, ensuring that the system meets regulatory requirements and delivers clear insights
into the decision-making process.
• The expected result is a highly accurate model (85-90% accuracy) that balances performance
with transparency, allowing financial institutions to make informed, data-driven lending
decisions while minimizing risk.

References
1. Barthe, S., Sanyal, S., Biswas, S. K., & Purkayastha, B. (2023). "Credit Risk Prediction
Using Genetic Algorithm and Stacking Ensemble." Journal of Financial Analytics.
2. Sikha, S., Vijayakumar, A., & Yemmanuru, P. K. (2023). "Explainable AI Using H2O
AutoML in Credit Risk Management." International Journal of Data Science.
3. Yeboah, J., & Nti, I. K. (2024). "Customer Credit Risk: Application of Machine Learning
and Deep Learning Models." Financial Risk Management Journal.
4. Vohra, N., & Goyal, D. (2023). "Optimizing Loan Approval Decisions with Ensemble
Learning." Financial Technology & Decision Systems.
5. Dhawan, K., & Jayakumar, N. (2023). "A Novel Approach to Optimize Credit Risk Using
Enhanced SVM." Journal of Risk and Model Optimization.

Thank You
