Optimizing Lead Conversion for X Education

The document discusses building a model to identify hot leads for an online education company. It covers data cleaning, EDA, feature engineering, building a logistic regression model, and evaluating the model performance. Key variables that impacted potential buyers were time on website, number of visits, lead source, last activity, lead origin, and current occupation.

Uploaded by

mohd ali

Group Members:

1. Siddharth Sagar
2. Mohammad Ali
• X Education sells online courses to industry professionals.
• X Education gets a lot of leads, but its lead conversion rate is poor. For example, if the company acquires 100 leads in a day, only about 30 of them are converted.
• To make this process more efficient, the company wishes to identify the most promising leads, also known as ‘Hot Leads’.
• If this set of leads is identified successfully, the lead conversion rate should go up, because the sales team can focus on communicating with the potential leads rather than making calls to everyone.

• Business Objective:
  • X Education wants to know its most promising leads.
  • To that end, it wants to build a model that identifies the hot leads.
  • The model is to be deployed for future use.
• Data cleaning and data manipulation.
  1. Check and handle duplicate data.
  2. Check and handle NA values and missing values.
  3. Drop columns that contain a large share of missing values and are not useful for the analysis.
  4. Impute values, if necessary.
  5. Check and handle outliers in the data.
• EDA
  1. Univariate data analysis: value counts, distribution of variables, etc.
  2. Bivariate data analysis: correlation coefficients and patterns between the variables, etc.
• Feature scaling, dummy variables and encoding of the data.
• Classification technique: logistic regression is used for model building and prediction.
• Validation of the model.
• Model presentation.
• Conclusions and recommendations.
• Total number of rows = 9240; total number of columns = 37.
• Single-value features such as “Magazine”, “Receive More Updates About Our Courses”, “Update me on Supply Chain Content”, “Get updates on DM Content”, “I agree to pay the amount through cheque” etc. have been dropped.
• “Prospect ID” and “Lead Number” have been removed, as they are not necessary for the analysis.
• After checking the value counts of the object-type variables, we found that some features do not have enough variance and dropped them: “Do Not Call”, “What matters most to you in choosing course”, “Search”, “Newspaper Article”, “X Education Forums”, “Newspaper”, “Digital Advertisement” etc.
• Columns with more than 35% missing values, such as ‘How did you hear about X Education’ and ‘Lead Profile’, have been dropped.
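The cleaning steps above can be sketched in pandas as follows. This is a minimal illustration on a tiny made-up frame, not the original notebook: the `TotalVisits` column name, the sample values, and the choice to drop (rather than impute) the remaining rows with missing values are all assumptions.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame standing in for the leads data.
df = pd.DataFrame({
    "Prospect ID": ["a1", "b2", "c3", "d4"],
    "Magazine": ["No", "No", "No", "No"],                # single-valued
    "Lead Profile": [np.nan, np.nan, "Other", np.nan],   # 75% missing
    "Lead Source": ["Google", "Direct Traffic", np.nan, "Google"],
    "TotalVisits": [2.0, 5.0, 3.0, np.nan],
})

# Remove exact duplicate rows, if any.
df = df.drop_duplicates()

# Drop identifier columns that are not needed for the analysis.
df = df.drop(columns=["Prospect ID"])

# Drop columns that hold a single distinct value.
single_valued = [c for c in df.columns if df[c].nunique(dropna=False) == 1]
df = df.drop(columns=single_valued)

# Drop columns with more than 35% missing values.
missing_share = df.isna().mean()
df = df.drop(columns=missing_share[missing_share > 0.35].index)

# Assumption: rows that still contain missing values are dropped;
# imputation would be the alternative.
df = df.dropna()
```

On the real data the same thresholds would apply to all 37 columns; only the 35% cut-off is taken from the slides.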
• Numerical variables are normalised.
• Dummy variables are created for object-type variables.
• Total rows for analysis: 8792
• Total columns for analysis: 43
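A short sketch of the two preparation steps, assuming min-max normalisation (the slides do not say which scaler was used) and one-hot encoding with the first level dropped. The frame and the `TotalVisits` column name are hypothetical.

```python
import pandas as pd

# Hypothetical sample with one numeric and one object-type column.
df = pd.DataFrame({
    "TotalVisits": [0.0, 2.0, 4.0, 10.0],
    "Lead Source": ["Google", "Direct Traffic", "Google", "Organic Search"],
})

num_cols = list(df.select_dtypes(include="number").columns)
cat_cols = list(df.select_dtypes(exclude="number").columns)

# Min-max normalisation: rescale each numeric column to the [0, 1] range.
col_min = df[num_cols].min()
col_max = df[num_cols].max()
df[num_cols] = (df[num_cols] - col_min) / (col_max - col_min)

# One dummy column per category level; drop_first removes the redundant
# first level so the dummies are not perfectly collinear.
df = pd.get_dummies(df, columns=cat_cols, drop_first=True)
```

Creating dummies is what grows the column count, which would explain why the analysis set has more columns (43) than the raw data.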
• Splitting the data into training and testing sets
  • The first basic step for regression is performing a train-test split; we have chosen a 70:30 ratio.
• Use RFE (recursive feature elimination) for feature selection
  • Running RFE with 15 variables as output
• Building the model by removing variables with a p-value greater than 0.05 and variables with a VIF value greater than 5
• Predictions on the test data set
• Overall accuracy: 81%
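The modelling steps above might look roughly like this with scikit-learn. It is a sketch on synthetic data, not the authors' code: the feature names `f0`..`f19` and the random data are made up, the p-value check is omitted (it would need a statistics package such as statsmodels), and all features with a VIF above 5 are dropped in one pass, whereas they are usually removed one at a time with a refit in between.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): 200 leads, 20 numeric features.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 20)),
                 columns=[f"f{i}" for i in range(20)])
# The target depends on the first three features plus noise.
signal = X["f0"] + 0.8 * X["f1"] - 0.5 * X["f2"]
y = (signal + rng.normal(scale=0.5, size=200) > 0).astype(int)

# 70:30 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=42)

# Recursive feature elimination down to 15 features.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=15)
rfe.fit(X_train, y_train)
selected = X_train.columns[rfe.support_]


def vif(frame):
    """VIF per column, read off the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(frame.to_numpy(), rowvar=False)
    return pd.Series(np.diag(np.linalg.inv(corr)), index=frame.columns)


# Keep only features whose VIF does not exceed 5.
vifs = vif(X_train[selected])
keep = vifs[vifs <= 5].index

# Fit the final model and score it on the held-out 30%.
model = LogisticRegression(max_iter=1000).fit(X_train[keep], y_train)
accuracy = model.score(X_test[keep], y_test)
```

`accuracy` here is the share of correctly classified test leads at the default 0.5 cut-off; the 81% figure in the slides comes from the real data and is not reproduced by this synthetic example.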
Finding the Optimal Cut-off Point
• The optimal cut-off probability is the probability at which sensitivity and specificity are balanced.
• From the second graph, it is visible that the optimal cut-off is at 0.35.
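One way to locate such a cut-off numerically is to scan a grid of candidate probabilities and pick the one where sensitivity and specificity are closest. The sketch below assumes this "smallest gap" rule and a 0.05 grid; the function name `balanced_cutoff` and the sample labels and probabilities are hypothetical, and both classes must be present in `y_true`.

```python
import numpy as np


def balanced_cutoff(y_true, y_prob, cutoffs=None):
    """Return (cutoff, sensitivity, specificity) with the smallest gap."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    if cutoffs is None:
        cutoffs = np.linspace(0.05, 0.95, 19)  # 0.05, 0.10, ..., 0.95

    best = None
    for c in cutoffs:
        y_pred = (y_prob >= c).astype(int)
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        tn = np.sum((y_pred == 0) & (y_true == 0))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        sensitivity = tp / (tp + fn)   # share of converted leads caught
        specificity = tn / (tn + fp)   # share of non-converted leads rejected
        gap = abs(sensitivity - specificity)
        # Strict comparison keeps the lowest cut-off among ties.
        if best is None or gap < best[0]:
            best = (gap, float(c), float(sensitivity), float(specificity))
    return best[1], best[2], best[3]


# Hypothetical labels and predicted conversion probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.12, 0.22, 0.31, 0.62, 0.41, 0.71, 0.83, 0.93]
cutoff, sens, spec = balanced_cutoff(y_true, y_prob)
```

Plotting sensitivity and specificity against the cut-off and reading off the crossing point, as the slides do, is the graphical version of the same idea.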
• The variables that mattered most for potential buyers were found to be (in descending order):
  • The total time spent on the website.
  • Total number of visits.
  • When the lead source was:
    a. Google
    b. Direct traffic
    c. Organic search
    d. Welingak website
  • When the last activity was:
    a. SMS
    b. Olark chat conversation
  • When the lead origin is Lead Add Form.
  • When their current occupation is working professional.
• Keeping these in mind, X Education can flourish, as it has a very high chance of getting almost all the potential buyers to change their minds and buy its courses.
