2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE)

Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry

Deepika Christopher
CHRIST (Deemed to be University), Delhi NCR

Garima Anand
CHRIST (Deemed to be University), Delhi NCR
2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE) | 979-8-3503-6684-6/24/$31.00 ©2024 IEEE | DOI: 10.1109/IC3SE62002.2024.10592931


Abstract: To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, AdaBoost, Extra Trees, Deep Neural Network, and a Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. The results indicate that the Logistic Regression model performs best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes associated with churn are tenure, Internet Service Fiber optic, and Internet Service DSL, and the top three models are Logistic Regression, Deep Neural Network, and AdaBoost. The K-means algorithm is applied to establish and analyze four customer clusters. The study identifies customers at risk of churn, and its findings may be used to develop and execute strategies that lower customer attrition.

Keywords: attrition, retention, customer segmentation

I. INTRODUCTION

Customer churn occurs when customers unsubscribe from or stop using a service they had been using continuously, which in turn leads to financial loss. Predictive analytics helps identify customers who are at risk of churn.

One factor stressed in many studies is that customer acquisition is costlier than customer retention [1], [2]. Every customer contributes to profit, so understanding customers and their needs is extremely important. Churn prediction is a crucial step for staying relevant and surviving in a competitive market [3].

Telecom, a leading communications industry, faces market saturation and shifting consumer preferences. As expenses rise, carriers are shifting their focus from acquisition to retention. Churn prediction based on data analytics is essential in telecom for identifying likely defectors and the factors that drive them to leave. A proactive approach identifies and targets at-risk clients based on billing, consumption, and demographics. The goal is to increase telecom profitability in highly competitive markets by lowering churn rates [4].

Prior studies on customer churn across several businesses have identified common factors that influence this behavior, including demographics, switching costs, service utilization, and customer satisfaction. Researchers have used statistical techniques including survival analysis, logistic regression, and Markov chains, together with machine learning techniques such as decision trees, neural networks, and support vector machines, to create prediction models. These results enable industries to create successful customer retention strategies by providing insight into, and forecasts of, customer attrition [5].

As churn rates rise in the telecom business, there is a pressing demand for predictive models that reliably identify consumers who may be considering switching providers. However, selecting the most accurate model to predict churn can be challenging because of the abundance of available techniques and the lack of comparative research to inform the choice.

II. LITERATURE REVIEW

Client attrition is one of the most important issues in the telecom industry and has to be carefully considered and addressed. The study by M. Pondel et al. [6] concentrated on creating a deep learning model for e-commerce customer attrition prediction, which is essential for business expansion regardless of size or sales channel. The experiment used real e-commerce data, which is difficult for prediction accuracy because a large percentage of purchases come from one-time customers. Prabadevi et al. [7] investigate customer churn using machine learning. Several techniques are used to achieve high accuracy rates, including stochastic gradient boosting (83.9%), random forest (82.6%), logistic regression (82.9%), and k-nearest neighbors (78.1%).

In the study by Lalwani et al. [8], the AdaBoost and XGBoost classifiers achieved the highest accuracy. Among the studies reviewed, the one that applied the largest number of predictive models used seven algorithms [9]. Tang et al. [10] propose a hybrid churn prediction model combining XGBoost and multi-layer perceptron (MLP) approaches. The model performs better when it uses MLP for one-hot vector transformation and XGBoost for handling numerical features. Experimental comparison with more conventional approaches validates its efficacy in predicting client attrition.

Jayaswal et al. [11] propose an ensemble approach for churn prediction in telecom, focusing on customer retention. Decision trees and ensembles analyze customer behavior and are implemented with Apache Spark for efficiency. Hyperparameter optimization enhances accuracy in churn prediction. The work by Sharmila et al. [12] emphasizes the value of retention strategies over client acquisition, with


a particular focus on customer churn prediction in the telecom industry. With the Random Forest classifier, it attains 99% accuracy, demonstrating excellent recall and precision rates for successful churn prediction and profit maximization. Nevertheless, the decision tree classifier models' initial poor performance on imbalanced datasets points to a possible drawback: the model's efficacy can be jeopardized when handling skewed data distributions. This emphasizes the need for additional studies that address data imbalance in churn prediction tasks.

Khodabandehlou and Zivari Rahman [13] suggest a predictive framework for customer churn prediction in which ANN demonstrates the greatest accuracy among machine learning techniques, pinpointing the RFMITSDP variables as key predictors and obtaining 97.92% accuracy. However, generalizability is limited by a short data period and single-store data collection, so care should be taken when extrapolating results to other industries. In telecom, SVM-POLY with AdaBoost achieves almost 97% accuracy and over 84% F-measure for customer churn prediction, demonstrating the superior performance of boosted versions over plain models. There is no discussion of generalizability to other industries or datasets, which raises the possibility of limitations in broader applicability; more investigation may be required for validation across various contexts [14].

Businesses are increasingly using machine learning models to predict customer attrition. However, comprehensive comparison studies are still necessary to find the optimal model. Furthermore, there is little research on how effectively these models identify the major causes of churn, which is a crucial step in creating focused retention strategies.

III. METHODOLOGY

The objective of this research is to compare 11 algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, AdaBoost, Extra Trees, Deep Neural Network, and a Hybrid Model (MLPClassifier). Of these, LightGBM and CatBoost are newer algorithms that have seen very limited use in this field.

The dataset used in this paper is the Telco Customer Churn data from Kaggle, which contains information about customers of a telecommunications company with 21 features: Customer ID, gender, Senior Citizen, Partner, Dependents, tenure, Phone Service, Multiple Lines, Internet Service, Online Security, Online Backup, Device Protection, Tech Support, Streaming TV, Streaming Movies, Contract, Paperless Billing, Payment Method, Monthly Charges, Total Charges, and Churn [15].

A. Data preprocessing

Data preprocessing is a crucial step in preparing the data for any machine learning model.

The data must be prepared before it is analyzed further. This involves cleaning the data to identify and handle any inconsistencies, errors, or outliers. Missing value imputation fills in or removes empty values in the dataset.

The dataset is observed to contain no missing values, so approximation techniques are not required.

The next step is to identify the categorical variables in the dataset and encode them. This is necessary because most algorithms require numerical inputs. Of the 21 variables, gender, Partner, Dependents, Phone Service, Multiple Lines, Internet Service, Online Security, Online Backup, Device Protection, Tech Support, Streaming TV, Streaming Movies, Contract, Paperless Billing, Payment Method, and Churn are categorical and are converted into numerical representations using 'OneHotEncoder'.

Finally, feature scaling is performed using 'StandardScaler', which puts all numerical features on the same scale by removing the mean and scaling to unit variance. This prevents features with larger magnitudes from dominating those with smaller magnitudes during model training.

B. Model Selection

A variety of machine learning algorithms exist for making predictions. It is important to choose the model that gives the best possible result for the chosen dataset. Since this research is a comparative study, it explores 11 chosen algorithms.

Logistic regression is a technique for estimating the likelihood of a discrete outcome given an input variable. Most logistic regression models represent a binary result, that is, something that can take two values such as true or false, or yes or no. When a scenario has more than two distinct discrete outcomes, multinomial logistic regression can model it. Logistic regression is a helpful analysis technique for classification problems, where the aim is to determine which group a new sample most closely belongs to [16].

SVM is a supervised machine learning algorithm that can be used for both classification and regression. It builds on the work of Boser et al. (1992) in statistical learning theory. The fundamental idea behind support vector machines (SVMs) is to determine the proper hyperplane for classification by using the points of each class that lie at the margin [17].

The Random Forest classifier is an ensemble technique based on "bagging": it trains multiple decision trees concurrently using bootstrapping and aggregation [18].

Decision tree approaches are effective for making predictions because they combine statistical models and data mining techniques. Morgan and Sonquist created decision trees in 1963 as a means of identifying the factors that influence social situations [19].

Freund and Schapire devised the first successful boosting algorithm in 1995, called Adaptive Boosting, or AdaBoost for short. AdaBoost focuses on enhancing performance in the areas where the initial iteration or base learner of the model fails. It employs an iterative method that learns from the errors of weak classifiers and merges numerous weak classifiers into a strong one, reducing the likelihood of misclassification. XGBoost is a highly efficient, adaptable, and portable implementation of optimized gradient boosting. It is a tree-based algorithm in the supervised branch of machine learning. Although the approach can be applied to regression problems as well as classification problems, all of the formulas and examples in


this narrative are related to classification. XGBoost optimizes the system and refines the algorithm to improve on the fundamental GBM framework. The LightGBM algorithm handles a huge number of data samples and features by using two novel techniques: Exclusive Feature Bundling (EFB) and Gradient-based One-Side Sampling (GOSS). GOSS keeps every instance with a large gradient and samples randomly from the instances with smaller gradients. Although the name CatBoost derives from "Category" and "Boosting," the algorithm can also handle textual and numeric features. CatBoost also has effective handling methods for small datasets and categorical data, and it uses oblivious (symmetric) trees [20].

As an extension of the random forest algorithm, Extra Trees is a machine learning strategy that reduces the likelihood of overfitting a dataset. Random Forest and Extra Trees are both made up of a sizable number of decision trees, each of whose predictions is taken into account in the final decision [21].

Although Deep Learning (DL) is based on the traditional neural network, it performs noticeably better than its forerunners. Moreover, DL builds multi-layer learning models by using graph technologies and transformations at the same time. The latest advancements in DL approaches have shown impressive results in a range of applications, such as natural language processing (NLP), audio and speech processing, and visual data processing [22].

Model selection is followed by training and evaluation. The models are trained on training data and evaluated on test data to assess performance. The default train-test split ratio is 75% to 25%; this study instead uses 80% of the data for training and 20% for testing.

To evaluate the models, accuracy, precision, recall, and F1-score are determined, after which the best-performing model is selected. Once a model has been implemented, testing its accuracy is also a crucial step. The F1 score, the harmonic mean of precision and recall, indicates how well a model balances the two; a higher score implies a better-performing model.

TABLE I. MODEL PERFORMANCE SUMMARY

Model                    Accuracy  Precision  Recall  F1-score
Logistic Regression      0.8176    0.6895     0.5657  0.6215
Support Vector Machine   0.8041    0.6790     0.4933  0.5714
Random Forest            0.7828    0.6218     0.4584  0.5278
Decision Tree            0.7140    0.4597     0.4584  0.4591
XGBoost                  0.7921    0.6333     0.5094  0.5646
LightGBM                 0.7977    0.6477     0.5174  0.5753
Deep Neural Network      0.8176    0.6986     0.5469  0.6135
Hybrid Model             0.7779    0.5829     0.5657  0.5741
CatBoost                 0.8077    0.6835     0.5094  0.5837
AdaBoost                 0.8148    0.6892     0.5469  0.6099
Extra Trees              0.7771    0.6007     0.4718  0.5285

The models' results are discussed in more detail in Section IV.

Once the machine learning models have been implemented, the feature importance of each variable in predicting customer churn (yes or no) is determined. Based on this, the three most significant features that contribute to customer churn are identified.

C. Customer Segmentation

Customer segmentation is the process of grouping similar customers according to attributes useful to marketing, such as requirements, preferences, behavior, or demographics. Segmentation is used to better understand consumers and to develop marketing strategies that cater to their particular requirements and preferences.

In this study, customer segmentation entails grouping clients according to their attributes, including demographics, services utilized, and billing data, using clustering algorithms. The dataset is divided into four clusters (0, 1, 2, and 3) using the K-means clustering algorithm, each of which represents a different consumer group. Following the clustering process, each cluster's characteristics are examined to identify the distinguishing attributes of the clients within each market segment. Because this analysis reveals significant trends and variations between client groups, businesses can better satisfy the needs of each segment by customizing their marketing campaigns, product offers, and customer service techniques.

D. Retention Strategy Formulation

To formulate retention strategies, first identify the key segments: the most valuable customer group and the group most vulnerable to churn.

Then define the goal for client retention, which in this case is a lower churn rate. This is followed by specialized tactics for the key segments to maintain customer loyalty and lower attrition.

E. Documentation and Reporting

Record every step taken in data preprocessing, model selection, client segmentation, and the creation of the retention plan. Give an overview of the key findings from each section of the study, emphasizing any significant patterns, trends, and insights. Highlight the key performance indicators (KPIs), such as client segments, churn rate, model performance metrics, and the effectiveness of retention strategies.

Inform the stakeholders of the results and implementation plans so that the strategies can be implemented successfully.
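As a cross-check on Table I, the F1 score of each model can be recomputed from its reported precision P and recall R using the standard definition F1 = 2PR / (P + R). The sketch below is illustrative and is not the authors' code: the function name `f1_from_precision_recall` and the dictionary `table_i` are introduced here for the example, and only the three top-ranked rows of Table I are copied.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from three rows of Table I.
table_i = {
    "Logistic Regression": (0.6895, 0.5657, 0.6215),
    "Deep Neural Network": (0.6986, 0.5469, 0.6135),
    "AdaBoost": (0.6892, 0.5469, 0.6099),
}

for model, (p, r, reported) in table_i.items():
    computed = f1_from_precision_recall(p, r)
    print(f"{model}: computed F1 = {computed:.4f}, reported F1 = {reported:.4f}")
```

Because the reported precision and recall are rounded to four decimals, the recomputed values should be compared with the table using a small tolerance rather than exact equality.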


Fig. 1. Data Processing Pipeline

IV. RESULTS AND DISCUSSION

The top three best-performing models are Logistic Regression, Deep Neural Network, and AdaBoost, with accuracies of 81.76%, 81.76%, and 81.48% and F1 scores of 0.6215, 0.6135, and 0.6099 respectively.

The straightforward and interpretable nature of the logistic regression model makes it easier to understand the link between the independent variables and the target variable. The Telco customer churn dataset may contain complicated nonlinear relationships between the features and the target variable that DNNs can capture. AdaBoost can reduce overfitting and enhance generalization performance by combining several weak learners into one strong learner.

Logistic Regression is determined to be the optimal model, and on that basis the top three features that contribute to customer churn are identified: Tenure, Internet Service Fiber optic, and Internet Service DSL.

Fig. 2. Logistic Regression Feature Importance

The following paragraphs look more closely at these factors and how they relate to a business environment.

Tenure refers to the amount of time a customer has been using a service, in this case that of the Telco company. It is directly linked to how well a business performs. Customers who have been with the business for a longer period are less likely to leave than new customers.

To lessen the impact of short tenure on churn, Telco firms can concentrate on keeping new customers by providing exclusive promotions, tailored onboarding experiences, and proactive customer assistance during the first few months of the subscription.

The Internet Service Fiber optic feature shows whether the user has a fiber optic internet subscription. Fiber optic services frequently offer high-speed internet, which could attract users. It is important to keep in mind, though, that some consumers may leave because they are unhappy with the price or level of service. By resolving concerns about fiber optic internet service, such as network dependability, consistency in speed, or cost, telco providers can raise customer satisfaction. Competitive pricing, service bundling, and incentives for long-term subscriptions can also aid customer retention.

The Internet Service DSL feature indicates whether the consumer has a DSL (Digital Subscriber Line) internet subscription. Although DSL is often slower than fiber optic, some users may find it more cost-effective. To lessen the influence of DSL service on churn, Telco businesses can improve the quality of DSL service by optimizing network performance, delivering value-added services, and providing dependable customer assistance. Customers may also be persuaded to stay through incentives for switching to faster services or for combining DSL with other services.


These features directly affect the business, and it is important to address the issues they raise in order to reduce churn.

A. The result of Customer Segmentation

TABLE II (A). COMPARISON OF CLUSTERS SUMMARY

Cluster  Senior Citizen  Tenure  Monthly Charges  Internet Service_Fiber optic
0        0.229           14.77   81.09            0.782
1        0.218           58.58   93.28            0.708
2        0.078           54.17   33.96            0
3        0.071           10.59   32.64            0

TABLE II (B). COMPARISON OF CLUSTERS SUMMARY

Cluster  Gender_Male  Partner_Yes  Dependents_Yes  Internet Service_No
0        0.494        0.361        0.199           0
1        0.502        0.681        0.329           0
2        0.501        0.629        0.437           0.576
3        0.523        0.316        0.303           0.494

TABLE II (C). COMPARISON OF CLUSTERS SUMMARY

Cluster  Phone Service_Yes  StreamingTV_No internet service  Churn_Yes
0        0.994              0                                0.491
1        0.988              0                                0.159
2        0.748              0.576                            0.048
3        0.797              0.494                            0.245

The tables above give an overview of each cluster's properties as determined by the segmentation analysis. They display the average value of each characteristic for every cluster.

Customers in Cluster 0 have comparatively high monthly charges, shorter tenure, and a high churn rate of 49.1%. This cluster has a relatively large proportion of senior citizens, and its customers tend to take fiber optic internet service.

Cluster 1 comprises customers with even higher monthly charges, long tenure, a relatively high proportion of senior citizens, and a lower churn rate of 15.9%. They also favor fiber optic internet service.

Customers in Cluster 2 may be content, long-term clients because they have low monthly charges, no fiber optic service, and a low churn rate of 4.8%. This cluster has a low proportion of senior citizens (7.8%), and a significant percentage of its customers have dependants (43.7%).

Finally, customers in Cluster 3 have short tenure, pay less each month, and do not have fiber optic service. Their intermediate churn rate of 24.5% indicates a combination of happy and unhappy clients. Like Cluster 2, this cluster has a low proportion of senior citizens.

Using these insights, marketing plans and client retention initiatives can be customized for every consumer segment.

Since Cluster 2 has the lowest churn rate, the aim is to continue encouraging loyalty. This can be done by providing loyalty benefits and discounts while accommodating customers with dependants by offering bundled (family-oriented) services.

With only 15.9% of customers leaving, Cluster 1 likewise has a low attrition rate. Since the cluster has a high proportion of customers with long tenure, personalized offers are a reasonable way to retain them. Because 68.1% of these customers have partners, couple packages or discounts can be provided. Maintaining customer satisfaction through good customer support is important.

To encourage long-term commitment from Clusters 0 and 3, which have high churn rates, offer personalized recommendations, promotional offers, and value-added services, pay attention to customer reviews, and provide support. These clusters are the focus, and effort must be put into them to fulfill the goal of retention.

Fig. 3. Customer Retention Strategies

V. CONCLUSION

Customer retention is an iterative process. After devising strategies to combat churn, it is important to study whether the implementation was successful, whether it managed to retain customers, and whether further action should be taken.

This paper identified Logistic Regression as the best predictor, with 81.76% accuracy, 68.95% precision, 56.57% recall, and an F1 score of 0.6215. The top three models identified are Logistic Regression, Deep Neural Network, and AdaBoost, while the top three features that contribute to churn are Tenure, Internet Service Fiber optic, and Internet Service DSL. Four customer clusters with different characteristics are formed and analyzed.

The insights from this paper fulfill the aim of identifying customers at risk of churn and can be used to formulate and implement effective strategies to reduce customer attrition. The dataset used in this research was relatively small, which could pose a limitation.

REFERENCES

[1] S. A. Qureshi, A. S. Rehman, A. M. Qamar, A. Kamal, and A. Rehman, “Telecommunication subscribers’ churn prediction model using machine learning,” in Eighth International Conference on Digital Information Management (ICDIM 2013), Islamabad, Pakistan, 2013.
[2] D. V. Poel and B. Larivi, “Customer attrition analysis for financial services using proportional hazard models,” European Journal of Operational Research, vol. 157, no. 1, pp. 196–217, 2004.
[3] Y. Suh, “Machine learning based customer churn prediction in home appliance rental business,” J. Big Data, vol. 10, no. 1, Apr. 2023.


[4] Y. Richter and N. Slonim, “Predicting customer churn in mobile networks through analysis of social groups,” in Proceedings of the 2010 SIAM International Conference on Data Mining (SDM), 2010, pp. 732–741.
[5] X. Zhang, J. Zhu, S. Xu, and Y. Wan, “Predicting customer churn through interpersonal influence,” Knowl. Based Syst., vol. 28, pp. 97–104, Apr. 2012.
[6] M. Pondel et al., “Deep learning for customer churn prediction in E-commerce decision support,” Bus. Inf. Sys., pp. 3–12, Jul. 2021.
[7] B. Prabadevi, R. Shalini, and B. R. Kavitha, “Customer churning analysis using machine learning algorithms,” International Journal of Intelligent Networks, vol. 4, pp. 145–154, 2023.
[8] P. Lalwani, M. K. Mishra, J. S. Chadha, and P. Sethi, “Customer churn prediction system: a machine learning approach,” Computing, vol. 104, no. 2, pp. 271–294, Feb. 2022.
[9] B. Zhang, “Customer churn in subscription business model - predictive analytics on customer churn,” BCP Business & Management, vol. 44, pp. 870–876, Apr. 2023.
[10] Q. Tang, G. Xia, X. Zhang, and F. Long, “A customer churn prediction model based on XGBoost and MLP,” in 2020 International Conference on Computer Engineering and Application (ICCEA), Guangzhou, China, 2020.
[11] P. Jayaswal, B. R. Prasad, D. Tomar, and S. Agarwal, “An ensemble approach for efficient churn prediction in telecom industry,” Int. J. Database Theory Appl., vol. 9, no. 8, pp. 211–232, Aug. 2016.
[12] K. Sharmila, A. A. Wagh, K. S. Andhale, J. R. Wagh, S. P. Pansare, and S. H. Ambadekar, “Customer churn prediction in telecom sector using machine learning techniques,” Results in Control and Optimization, vol. 14, 2024.
[13] S. Khodabandehlou and M. Zivari Rahman, “Comparison of supervised machine learning techniques for customer churn prediction based on analysis of customer behavior,” J. Syst. Inf. Technol., vol. 19, no. 1/2, pp. 65–93, Mar. 2017.
[14] T. Vafeiadis, K. I. Diamantaras, G. Sarigiannidis, and K. C. Chatzisavvas, “A comparison of machine learning techniques for customer churn prediction,” Simul. Model. Pract. Theory, vol. 55, pp. 1–9, Jun. 2015.
[15] https://www.kaggle.com/datasets/blastchar/telco-customer-churn/data
[16] T. W. Edgar and D. O. Manz, Research methods for cyber security. Rockland, MA: Syngress Media, 2017.
[17] Deep Learning for Sustainable Agriculture. A volume in Cognitive Data Science in Sustainable Computing.
[18] S. Misra and H. Li, “Noninvasive fracture characterization based on the classification of sonic wave travel times,” in Machine Learning for Subsurface Characterization, Elsevier, 2020, pp. 243–287.
[19] K. Kempf-Leonard, Ed., Encyclopedia of social measurement. San Diego, CA: Academic Press, 2004.
[20] Boosting Algorithm to Handle Unbalanced Classification of PM2.5 Concentration Levels by Observing Meteorological Parameters in Jakarta-Indonesia Using AdaBoost, XGBoost, CatBoost, and LightGBM.
[21] Y. Lou et al., “Individualized empirical baselines for evaluating the energy performance of existing buildings,” Sci. Technol. Built Environ., vol. 29, no. 1, pp. 19–33, Jan. 2023.
[22] L. Alzubaidi et al., “Review of deep learning: concepts, CNN architectures, challenges, applications, future directions,” J. Big Data, vol. 8, no. 1, Mar. 2021.
