Interpretability of Neural Networks: a credit card
default model example
Ksenia Ponomareva
Simone Caenazzo
Abstract
Neural networks have risen in popularity for a number of applications, also in quan-
titative finance. However, the low interpretability of their ‘black box’ representation
has always been a common criticism. Previous literature has attempted to provide
a better understanding and visualisation of neural networks, focusing primarily on
image classification. This paper shows the feasibility of applying the same methods
to an example deep neural network model, concerned with the estimation of credit
risk for a portfolio of credit cards. Results show that the analysis of relevance,
sensitivity and neural activities can increase the interpretability of a neural network
in a financial modelling context.
1 Introduction and motivation
Historically, the widespread use of advanced deep learning models in sensitive fields like medicine
and finance has been hindered by a fundamental lack of human interpretability regarding the outcomes
of such advanced models. Simpler techniques such as linear or logistic regressions yield outcomes
which are deterministic in nature and follow mechanics which are well understood and controlled
by model developers and analysts. Deep neural networks, however, have a large number of hidden
layers and neurons, the exact roles of which are not easily understood by humans [1].
The interpretability issue in the financial services context has started to receive broad coverage in recent times. Examples can be found in [2], [3] and [4]. In [5], the interpretability issue of neural-network-based models has been introduced in the context of Retail Banking, looking at the opposing challenges in data analysis and model interpretability that financial institutions are facing.
A number of research streams have recently been seeking solutions for the issue of interpreting deep neural network models. Among them are the following:
• Relevance analysis: how much of the output (e.g. a probability of default) is directly due to a given input variable?
• Sensitivity analysis: how much does the output change subject to a (small) change in a given input variable?
• Neural activity analysis: which neural paths are most activated by a given input variable?
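As an illustration of the second stream, sensitivity can be approximated numerically even without analytic gradients. The sketch below uses a central finite difference, with a hypothetical `model_fn` standing in for the trained network's scoring function; it is one possible implementation, not the paper's own code:

```python
def sensitivity(model_fn, x, eps=1e-4):
    """Central finite-difference sensitivity of a scalar model
    output with respect to each input feature."""
    grads = []
    for i in range(len(x)):
        x_up, x_dn = list(x), list(x)
        x_up[i] += eps
        x_dn[i] -= eps
        grads.append((model_fn(x_up) - model_fn(x_dn)) / (2 * eps))
    return grads

# Toy model: the output depends strongly on x[0] and weakly on x[1].
toy = lambda x: 3.0 * x[0] + 0.1 * x[1]
grads = sensitivity(toy, [0.5, 0.5])
```

For a linear toy model the estimated sensitivities recover the coefficients exactly; for a real network the same call would be made around each portfolio record of interest.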
Another promising research avenue is found in [6], where a novel class of neural networks (called Deep Fundamental Factor models) with enhanced information ratios is introduced in the context of multi-factor asset models.
For larger banks and institutions, these research outputs make it possible to consider developing more sophisticated models that previously could not comply with rigorous regulatory standards; for smaller fintechs that already use artificial intelligence techniques, they provide a way of building deeper model validation into their processes.
Electronic copy available at: https://ssrn.com/abstract=3519142

The primary focus of this paper is to show the feasibility of these methods in overcoming the
interpretability hurdles around the application of neural networks and deep learning in business
and/or risk processes in the financial industry. This is achieved by analysing a credit/default risk
neural-network-based model in the context of a credit card portfolio. The analysis involves applying
a selection of relevance, sensitivity and neural activation analysis techniques, demonstrating their
ability to explain the model’s mechanics. It is worth clarifying, however, that the paper does not focus
on trying to structure a neural network model that outperforms other techniques for the particular
dataset at hand. Instead it focuses on developing foundations for understanding, examining and
modelling interpretability and explainability of deep learning models.
The rest of the paper is organised as follows. Section 2 provides details on the construction of the
dataset and architecture, as well as a brief discussion on the training and test results for the credit
card default model to be interpreted. Section 3 introduces the relevance analysis technique, Section 4
provides details for the sensitivity methods and Section 5 describes the neuron activity analysis. All
three sections illustrate techniques by analysing their application to the model examples. Conclusions
are drawn in Section 6.
2 Dataset and the neural network
2.1 Dataset and features
For this paper, a publicly available dataset from the UCI machine learning repository [7] has been used.
This data provides information on default payments, demographic factors, credit data, history of
payment and bill statements of credit card clients in Taiwan from April 2005 to September 2005.
This dataset has been used in various academic papers, see [8], [9], [10], as well as online machine learning blogs [11].
The input used in this paper has 23 features, similar to [8]. There are four demographic features
covering gender, education, marital status and age of each client. These are followed by 18 features
providing the history of payment and bill statements, i.e. repayment status as well as the amounts of bill statements and previous payments for six consecutive months. The last feature is the amount of credit given to the client. These input features have been scaled between zero and one to speed up convergence of gradient descent and accelerate training, as well as to enable easy comparison of sensitivities. The ground truth, or true label, takes two values: zero for no default and one if the client defaults on the next month's payment.
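The min-max scaling described above can be sketched as follows (a minimal per-column version, assuming numeric feature columns; the paper does not specify its exact implementation):

```python
def min_max_scale(column):
    """Rescale a numeric feature column to the [0, 1] range."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # guard against constant columns
    return [(v - lo) / span for v in column]

# Example: ages are mapped onto [0, 1] before training.
scaled_ages = min_max_scale([21, 35, 70])
```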
Data is randomly shuffled and then split in such a way that 80% is used for training and 20% for testing. In the original dataset, ≈ 78% of entries represent non-default. This finding is as expected for this specific financial context.
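The shuffle-and-split step can be sketched as below (a seeded stand-in for reproducibility; the paper does not state which library performs its split):

```python
import random

def shuffled_split(rows, test_frac=0.2, seed=0):
    """Randomly shuffle the dataset, then hold out a test fraction."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(rows) * test_frac)
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test

train, test = shuffled_split(list(range(100)))
```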
2.2 Network specification
The model examined in this paper is a feed-forward neural network. The network topology has
been determined after a brief hyper-parameter tuning exercise that experimented with hyperbolic tangent (tanh) and Rectified Linear Unit (ReLU) activation functions, as well as various numbers of
hidden layers and neurons. The final model architecture was selected as the one that minimised the
discrepancy between accuracy metrics in the training and testing phases, whilst keeping training time
within reasonable levels.
The final model topology comprises three hidden layers, see Figure 1 for the high-level architecture, with 100, 50 and 10 nodes respectively. The three hidden layers have activation functions A1, A2 and A3 respectively, the input vector can be considered as A0, and the output has final activation A4. All activations apart from the final one are ReLU, where ReLU(z) = max(0, z). Since predicting whether a client would default or not in the next month is a binary classification problem, the final activation is the sigmoid function, sig, where

A4 = sig(Z4) = 1 / (1 + e^(−Z4))
and
Z4
is the linear transformation applied to the output of the last hidden layer. To avoid over-fitting,
neuron dropouts are implemented in the training phase across all hidden layers, with 65%, 50% and
25% dropout rates in the first, the second and the last layers respectively.
Figure 1: High-level architecture for the considered neural network.
This model takes the dataset described in Section 2.1 as an input. The classification output is a default prediction, Ŷ = D(A4), obtained from a sigmoid score of the final activation A4 as follows:

D(A4) = 1 if A4 = sig(Z4) > 0.5,
        0 otherwise.    (1)
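As a hedged sketch, the topology and decision rule above can be written out as a plain forward pass. The weights below are random placeholders, not the trained parameters, and dropout is omitted because it is active only during training:

```python
import math
import random

def dense(x, W, b):
    """Linear transformation Z = Wx + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(z):
    return [max(0.0, v) for v in z]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, layers):
    """A0 = input; A1..A3 = ReLU hidden layers; A4 = sigmoid score."""
    a = x
    for W, b in layers[:-1]:
        a = relu(dense(a, W, b))
    W, b = layers[-1]
    return sigmoid(dense(a, W, b)[0])  # A4 = sig(Z4)

def decide(a4, threshold=0.5):
    """Decision rule of Equation (1): predict default iff A4 > threshold."""
    return 1 if a4 > threshold else 0

# Random placeholder weights for the 23-100-50-10-1 topology.
random.seed(0)
sizes = [23, 100, 50, 10, 1]
layers = [([[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
           [0.0] * n_out)
          for n_in, n_out in zip(sizes, sizes[1:])]
x = [random.random() for _ in range(23)]
a4 = forward(x, layers)
y_hat = decide(a4)
```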
2.3 Results analysis
The neural network, once trained, achieves ≈ 82% accuracy both on the train and the test datasets.
This might be considered reasonable accuracy for a binary classification and, in fact, to the authors’
best knowledge, it has not been exceeded for this particular dataset in the published literature so far.
However, since the dataset is highly dominated by non-default entries, overall accuracy alone will not
provide enough information on how well the model predicts defaults.
Firstly, other metrics such as precision, recall and F1 score should be considered, where:

precision = TP / (TP + FP),   recall = TP / (TP + FN),   F1 = 2 × (precision × recall) / (precision + recall).
Here, TP stands for true positives (both prediction and true label are default), FP stands for false
positives (prediction is default but true label states no default occurs) and FN means false negatives
(prediction is no default and true label is default). Table 1 shows the normalised confusion matrix for
the test dataset, based on which it can be concluded that the model has 71% precision, 31% recall and an F1 score of 43%. This means that 71% of the clients for whom defaults are predicted by the model would indeed default. However, out of all the client defaults that occur in the next month, only 31% are correctly identified by the model.
Predicted Default Predicted Non-Default
Actual Default 6.88% (TP) 15.10% (FN)
Actual Non-Default 2.82% (FP) 75.20% (TN)
Table 1: Normalised confusion matrix for the test dataset.
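The quoted precision, recall and F1 figures follow directly from the Table 1 entries (expressed as percentages of the test set):

```python
def classification_metrics(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Table 1 entries: TP = 6.88%, FP = 2.82%, FN = 15.10%.
precision, recall, f1 = classification_metrics(6.88, 2.82, 15.10)
```

Rounding to the nearest percent recovers the 71%, 31% and 43% reported above.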
Secondly, the receiver operating characteristic (ROC) curve obtained from test data should be
examined, see Figure 2 for the details. The ROC curve is obtained as a scatter plot of true positive
rate (TPR) versus false positive rate (FPR) for increasing values of the decision threshold in Equation
(1). TPR and FPR are defined as:
TPR = TP / (TP + FN),   FPR = FP / (FP + TN).
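Sweeping the decision threshold and collecting (FPR, TPR) pairs traces out the ROC curve. A minimal sketch follows, with illustrative scores and labels rather than the paper's test set:

```python
def roc_points(scores, labels, thresholds):
    """(FPR, TPR) pairs as the decision threshold is swept."""
    pts = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s <= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and y == 0)
        tn = sum(1 for s, y in zip(scores, labels) if s <= t and y == 0)
        pts.append((fp / (fp + tn) if fp + tn else 0.0,
                    tp / (tp + fn) if tp + fn else 0.0))
    return pts

# Illustrative scores/labels; a perfect ranker traces the top-left corner.
curve = roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0], [0.0, 0.5, 1.0])
```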