Predicting Customer Churn in Energy Sector

The document discusses predicting customer churn in the energy sector through predictive modeling. It describes the customer lifecycle and importance of retention. The methodology involves acquiring customer data, cleaning outliers and missing values, selecting predictive features, and using machine learning algorithms like decision trees, logistic regression and neural networks to build predictive models of churn. Models are trained on a sample and tested on holdout data to evaluate performance. Class imbalance is addressed through resampling approaches like SMOTE. The goals are to develop an accurate predictive model and understand customer psychology related to churn.

Uploaded by

Kimesha Brown
Should They Stay or Should They Go?
The Prediction of Customer Churn in Energy Sector

Michela Vezzoli (University of Milano-Bicocca) & Cristina Zogmaister (University of Milano-Bicocca)

Big Data in Psychology 2018


Churn Behaviour

[Figure: profits over time across the customer lifetime. Acquisition phase: acquisition. Retention phase: cross-sell and up-sell. Churn phase: churn, followed by win back.]
Why study churn behaviour?

Economic reasons

Attracting new customers costs 5 to 6 times more than retaining existing ones

Long-term customers generate more profits

Long-term customers are less sensitive to competitors’ marketing campaigns

Long-term customers are less costly to maintain over time

Long-term customers provide new referrals through positive word-of-mouth

Psychological reasons

More importantly, churners are unhappy, unsatisfied and no longer loyal customers


How to study Churn: Making predictions

[Figure: historical information on customers ➜ predictive modelling (machine learning) ➜ prediction: stay or go]


The aims of the study

1. Develop the predictive model
2. Understand the predictive relationships
3. Shed light on the consumer psychology


Methodology for developing predictive churn models

Predictive modelling turns data into information and information into insight

It does not demand a priori hypotheses ➜ Data Driven

The Data Mining Process

Data Acquisition ➜ Data Cleaning ➜ Model Training and Tuning ➜ Model Testing ➜ Action




Data acquisition and cleaning

Original datasets (CRM, BILLING, CAMEO): 820,123 customers

Data cleaning: getting to know the data, defining the time window, outlier treatment, handling missing values, feature selection

Targeted population (electricity, residential, single POD): 81,813 customers


Predictors

Socio-Demographics
Age, Sex, Regional area

Account
Customer type, Length of the contract, Acquisition channel, Loyalty program member, Payment method, Online billing

Behavioural
Number of complaints, Number of offer changes, Number of contacts, Number of retention proposals, Contract starts with a transfer, Number of cross-sell proposals, Digital customer, Number of previous churns

Socio-Economics
Socio-economic status, Presence of adults over 60, Presence of children, Household size, Education, Building age
Train-Test split

[Figure: the training set is used to train the model; the test set is used to test the model and measure its performance]

                      Total Dataset    Training Set     Test Set
N of Churner (%)      6899 (8.4%)      4848 (8.5%)      2050 (8.3%)
N of Non-Churner (%)  74915 (91.6%)    52421 (91.5%)    22494 (91.7%)
Total                 81813            57269            24544
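The split can be sketched in plain Python. This is a minimal sketch, not the authors' code: the 70/30 ratio is inferred from the set sizes in the table, splitting each class separately (stratification) is an assumption, and `stratified_split`, the seed and the toy labels are hypothetical.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately so that
    both sets keep roughly the same churn rate."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        # Indices of all customers belonging to this class.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy data (assumption): 1 = churner, 0 = non-churner, 8% churners.
labels = [1] * 80 + [0] * 920
train_idx, test_idx = stratified_split(labels)
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(train_rate, 3), round(test_rate, 3))
```

Only the training indices would be used for fitting and tuning; the test indices stay untouched until the final performance check.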
Class imbalance

The number of non-churners is far higher than the number of churners

Resampling Approach: SMOTE (Synthetic Minority Over-sampling Technique)

                      SMOTE Training Sample
N of Churner (%)      24240 (45.5%)
N of Non-Churner (%)  29088 (54.5%)
Total                 53328
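The idea behind SMOTE can be sketched as follows: each synthetic churner is placed at a random point on the segment between a real churner and one of its nearest churner neighbours. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function `smote`, its parameters and the two-feature toy points are hypothetical, and the sketch handles numeric features only (categorical predictors need a variant such as SMOTE-NC). The counts in the table are consistent with four synthetic churners per training churner (4848 × 5 = 24240) and 1.5 retained non-churners per synthetic churner (19392 × 1.5 = 29088); this is an inference from the numbers, not something the slides state.

```python
import math
import random


def smote(minority, n_new_per_sample=4, k=5, seed=0):
    """Create synthetic minority samples by interpolating between each
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # Rank the other minority points by Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(n_new_per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy churners described by two numeric features (assumption).
churners = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5), (1.8, 1.6), (2.2, 2.0)]
new_points = smote(churners, n_new_per_sample=4, k=3)
print(len(churners) + len(new_points))
```

Resampling is applied to the training sample only, so the test set keeps the original class proportions.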
Modelling phase: Training and testing

Churn prediction is a supervised learning classification task

[Figure: candidate algorithms arranged along an axis running from interpretability to accuracy: Decision Tree, Logistic Regression, Support Vector Machine, Random Forest, Neural Network]


Modelling phase: Training and testing

Churn prediction is a supervised learning classification task

Decision Tree (CART and C5.0)

Am I hungry?
  Yes: Do I have 25 €?
    Yes: Go to restaurant
    No: Buy hamburger
  No: Go to sleep
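A fitted decision tree is, in effect, a set of nested conditions, so the toy example on this slide can be written directly as code. The function name `decide` and the reading of "25 €" as "at least 25 €" are assumptions made for illustration.

```python
def decide(hungry: bool, euros: float) -> str:
    """Follow the toy tree: each `if` is one internal node, each return a leaf."""
    if not hungry:
        return "go to sleep"
    if euros >= 25:
        return "go to restaurant"
    return "buy hamburger"


print(decide(hungry=True, euros=10))
```

Algorithms such as CART and C5.0 learn the questions and thresholds from the data instead of having them written by hand.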




Modelling phase: Training and testing

Churn prediction is a supervised learning classification task

Decision Tree (CART and C5.0)

Logistic Regression

[Figure: ROC curve, with False Positive Rate (0 to 1) on the horizontal axis and True Positive Rate (0 to 1) on the vertical axis]

Performance measure: Area Under the ROC Curve (AUC)

Values range from 0.5 (random model) to 1 (perfect model)
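The AUC can be read as the probability that a randomly chosen churner receives a higher predicted score than a randomly chosen non-churner, with ties counted as one half. The sketch below computes it that way in plain Python; the function name `auc` and the toy labels and scores are assumptions for illustration.

```python
def auc(labels, scores):
    """AUC as the share of (churner, non-churner) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0, 0, 1]              # 1 = churner, 0 = non-churner
perfect = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7]  # every churner outranks every non-churner
constant = [0.5] * 6                      # uninformative model
mixed = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7]    # some pairs ranked wrongly
print(auc(labels, perfect), auc(labels, constant), round(auc(labels, mixed), 3))
```

The pairwise loop is quadratic in the number of customers, so it is meant to show the definition rather than to scale to the full dataset.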


Results

Model                             AUC Train   AUC Test
CART Unbalanced                   0.51        0.51
CART + SMOTE                      0.76        0.56
C5.0 Unbalanced                   0.51        0.51
C5.0 + SMOTE                      0.73        0.56
Logistic Regression Unbalanced    0.67        0.68
Logistic Regression + SMOTE       0.77        0.63
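One way to read the results is to compare each model's training and test AUC: a large positive gap suggests the model fits the training sample better than it generalises. The sketch below only redoes that arithmetic on the reported values; the dictionary layout and variable names are illustrative assumptions.

```python
# (AUC train, AUC test) as reported in the results table.
results = {
    "CART Unbalanced": (0.51, 0.51),
    "CART + SMOTE": (0.76, 0.56),
    "C5.0 Unbalanced": (0.51, 0.51),
    "C5.0 + SMOTE": (0.73, 0.56),
    "Logistic Regression Unbalanced": (0.67, 0.68),
    "Logistic Regression + SMOTE": (0.77, 0.63),
}

# Gap between training and test AUC for each model.
gaps = {model: train - test for model, (train, test) in results.items()}

# Model with the highest AUC on the held-out test set.
best = max(results, key=lambda model: results[model][1])

for model, gap in gaps.items():
    print(f"{model:32s} train-test gap = {gap:+.2f}")
print("Best test AUC:", best)
```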


Odds ratio: Socio-demographic predictors

Regional Area
  Center        1.17
  North East    1.01
  South         1.40
  Islands       1.14
Sex
  Female        0.99
Age             0.99

[Colour legend in the original slide: no relationship; decreases the likelihood of churn; increases the likelihood of churn]

Odds ratio: Account predictors

Acquisition Channel
  Agency          2.17
  Counter         0.47
  Call Center     1.09
  Web             1.18
  Tele-selling    1.25
Customer Type
  Dual            0.89
Payment Method
  RID             0.95

[Colour legend in the original slide: no relationship; decreases the likelihood of churn; increases the likelihood of churn]


Odds ratio: Account predictors

Length of the Contract          0.986
Online Billing: Yes             0.84
Start with Transfer: Yes        0.89
Loyalty Program Member: Yes     0.78
Digital Customer: Yes           0.95

[Colour legend in the original slide: no relationship; decreases the likelihood of churn; increases the likelihood of churn]


Odds ratio: Behavioural predictors

Cross-Sell Proposal     0.70
Change Offer            0.449
Retention Proposal      1.478
Number of Contacts      1.04
Number of Complaints    1.36
Previously Churn        1.50

[Colour legend in the original slide: no relationship; decreases the likelihood of churn; increases the likelihood of churn]


Odds ratio: Socio-economic predictors

Socio-Economic Status         1.08
Presence of adults over 60    1.02
Presence of children          1.02
Household Size                0.99
Building Age                  0.97
Education                     0.99

[Colour legend in the original slide: no relationship; decreases the likelihood of churn; increases the likelihood of churn]


Discussion

Contributions to the knowledge on consumers’ churn behaviour

Implementing machine learning techniques and data mining methodology in consumer psychology research

Logistic regression outperformed the CART and C5.0 decision trees. Moreover, logistic regression proved to be robust


Limitations and Developments
Consider other predictors

Examine multiple retailers

Actual behaviours of electricity consumers ➜ Ecological Validity

Disentangle the causality of some of the effects we found
