Chapter 10 – Logistic Regression
Data Mining for Business Intelligence
Shmueli, Patel & Bruce
© Galit Shmueli and Peter Bruce 2010
Logistic Regression
Powerful model-based classification tool
Extends idea of linear regression to situation where outcome variable is categorical
Model relates predictors with the outcome
Example: Y denotes recommendation on holding/selling/buying a stock – categorical variable with 3 categories
We focus on binary classification (Y = 0 or Y = 1), but predictors can be categorical or continuous
Widely used, particularly where a structured model is useful
The Logit
Goal: Find a function of the predictor variables that relates them to a 0/1 outcome
Instead of Y as outcome variable (as in linear regression), we use a function of Prob(Y=1) called the logit
Logit can be modeled as a linear function of the predictors
The logit can be mapped back to a probability, which, in turn, can be mapped to a class
Use a cutoff value on the probability of belonging to class 1, P(Y=1)
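To make the logit → probability → class mapping concrete, here is a minimal Python sketch (not from the textbook; the function name logit_to_class and the 0.5 cutoff are illustrative assumptions):

```python
import numpy as np

def logit_to_class(logit_value, cutoff=0.5):
    """Map a logit to a probability (inverse logit), then to a 0/1 class."""
    p = 1 / (1 + np.exp(-logit_value))   # probability of belonging to class 1
    return p, int(p > cutoff)            # classify as "1" if p exceeds the cutoff

print(logit_to_class(1.2))   # (~0.769, 1)
```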
From MLR to Logistic Regression
In linear regression the right-hand side, $\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q$, can take any value in $(-\infty, +\infty)$, while a probability must lie in $[0, 1]$. How to make them match? Logistic regression!

$p = P(Y = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q)}}$    (eq. 10.2 in textbook)

Another format:

$p = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q}}$
Step 2: The Odds
The odds of an event are defined as:

$\text{Odds} = \frac{p}{1 - p}$    (eq. 10.3), where $p$ = probability of the event

Or, given the odds of an event, the probability of the event can be computed by:

$p = \frac{\text{Odds}}{1 + \text{Odds}}$    (eq. 10.4)

We can also relate the Odds to the predictors:

$\text{Odds} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q}$    (eq. 10.5)
Recall that $p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q)}}$ (eq. 10.2); substituting this into eq. 10.3 yields eq. 10.5.
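As a quick worked check of eqs. 10.3–10.4, using the 9.6% overall acceptance rate from the loan example later in the chapter: $p = 0.096 \Rightarrow \text{Odds} = \frac{0.096}{1 - 0.096} \approx 0.106$, and converting back, $\frac{0.106}{1 + 0.106} \approx 0.096$.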
Step 3: Take log on both sides
• This gives us the logit:

$\log(\text{Odds}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q$    (eq. 10.6)

• Log(Odds) is called the logit, and it takes values from –∞ to +∞
• The logit is the dependent variable, and is a linear function of the predictors x1, x2, …, xq
• This helps make interpretations easier
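A quick numerical check (a sketch, not from the book) that the logit spans the whole real line as p moves through (0, 1):

```python
import numpy as np

# The logit is 0 at p = 0.5, strongly negative near p = 0, strongly positive near p = 1
for p in [0.001, 0.096, 0.5, 0.9, 0.999]:
    odds = p / (1 - p)   # eq. 10.3
    print(f"p={p:.3f}  odds={odds:8.3f}  logit={np.log(odds):7.3f}")
```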
Example: Acceptance of Personal Loan Offer
Outcome variable: accept bank loan (0/1)
Predictors: demographics (age, income, etc.) and information about the customer's bank relationship (mortgage, securities account, etc.)
Data: 5000 customers – 480 (9.6%) accepted the loan offer previously
Goal: find characteristics of customers who are most likely to accept the loan offer in future mailings
Data preprocessing
Partition: 60% training, 40% validation
Create 0/1 dummy variables for categorical predictors
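A minimal preprocessing sketch in Python; the DataFrame loan_df is a toy stand-in for the 5000-customer data, and its column names are hypothetical (the book's 60/40 partition ratio is used):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the loan data (columns are hypothetical)
loan_df = pd.DataFrame({
    "Income":       [49, 34, 180, 45, 112, 29, 85, 60],
    "Family":       [4, 3, 1, 2, 1, 4, 3, 2],
    "Education":    ["UG", "UG", "Grad", "Prof", "Prof", "UG", "Grad", "Grad"],
    "PersonalLoan": [0, 0, 1, 0, 0, 0, 1, 0],
})

# 0/1 dummy variables for the categorical predictor
loan_df = pd.get_dummies(loan_df, columns=["Education"], drop_first=True)

# 60% training / 40% validation partition
train_df, valid_df = train_test_split(loan_df, test_size=0.4, random_state=1)
```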
Single Predictor Model
Modeling loan acceptance on income (x)
Fitted coefficients: b0 = -6.3525, b1 = 0.0392
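Plugging these coefficients into eq. 10.2 gives the fitted model; a quick sketch evaluating it (assuming income is measured in $000s, as in the book's data):

```python
import numpy as np

b0, b1 = -6.3525, 0.0392   # fitted coefficients from the slide

def p_accept(income):
    """Fitted single-predictor model (eq. 10.2): P(accept loan | income)."""
    return 1 / (1 + np.exp(-(b0 + b1 * income)))

print(p_accept(100))   # ~0.081 for income = 100 (i.e., $100K)
```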
Last step - classification
Model produces an estimated probability of being a “1”
Example: P(accept loan | income)
Convert to a classification by establishing a cutoff level
If estimated probability > cutoff, classify as “1”
Thus the model supports classification as well as estimating the probability of belonging to a class
Default cutoff value: 0.50, but it can be changed, e.g., to maximize classification accuracy
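A sketch of the cutoff step (the probability values here are made up for illustration):

```python
import numpy as np

probs = np.array([0.03, 0.62, 0.48, 0.91])   # estimated P(Y=1) for four customers
cutoff = 0.5                                  # default cutoff; can be tuned
print((probs > cutoff).astype(int))           # [0 1 0 1]
```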
Example: Parameter estimation
Estimates of the β's are derived through an iterative process called maximum likelihood estimation (MLE)
Let us now include all 12 predictors in the model
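A hedged sketch of the fitting step using statsmodels (one possible tool; the data here are synthetic, not the book's loan data, and the real model would use all 12 predictors):

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-in for the training data: three hypothetical predictors
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
true_logit = -2.0 + 1.0 * X[:, 0] + 0.5 * X[:, 1]     # eq. 10.6 form
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

# Coefficients are estimated by iterative maximum likelihood
model = sm.Logit(y, sm.add_constant(X)).fit()
print(model.summary())   # coefficients, standard errors, p-values
```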
Estimated Equation for Logit
• Interpreting binary predictor effects:
• The odds of accepting the loan offer for those who already have a CD account with the bank are 32.1 times the odds for those who do not have a CD account (p-value < 0.001).
• Interpreting continuous predictor effects:
• The odds of accepting the loan offer increase by 77.1% for each additional family member (p-value < 0.001).
• The odds of accepting the loan offer decrease by 4.4% for each additional year of age, but this effect is not statistically significant (p-value = 0.624).
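These interpretations come from exponentiating the fitted coefficients: the odds ratio for a one-unit increase in predictor $x_j$ is $e^{\beta_j}$, i.e., a percent change of $100(e^{\beta_j} - 1)\%$. For example (coefficient values back-computed here from the slide's odds ratios, not copied from the textbook output): $e^{\beta} = 32.1 \Rightarrow \beta \approx \ln 32.1 \approx 3.47$ for the CD account dummy; $e^{\beta} = 1.771$ for family size (a 77.1% increase in odds per member); $e^{\beta} = 0.956$ for age (a 4.4% decrease per year).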
Variable Selection
Problems:
As in linear regression, correlated predictors introduce bias in the method
Overly complex models run the danger of overfitting
Solution: remove extreme redundancies by dropping predictors via automated selection of variable subsets (as in linear regression) or by data reduction methods such as PCA (see the sketch below)
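A hedged sketch of automated subset selection with scikit-learn (one of several possible approaches; the data are synthetic, reusing the pattern from the MLE sketch above):

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic data: 12 hypothetical predictors, only the first two matter
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 12))
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + X[:, 0] + 0.5 * X[:, 1]))))

# Forward selection of a smaller subset of predictors
selector = SequentialFeatureSelector(LogisticRegression(), n_features_to_select=4)
selector.fit(X, y)
print(selector.get_support())   # boolean mask of the selected predictors
```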
P-values for Predictors
Test the null hypothesis that a coefficient = 0
P-values reported alongside the coefficients give the results of these tests
Coefficients with low p-values (close to 0) are statistically significant
Useful when reviewing whether to include a variable in the model
Key in profiling tasks, but less important in predictive classification
Summary
Logistic regression is similar to linear regression, except that it is used with a categorical response
It can be used for explanatory tasks (= profiling) or predictive tasks (= classification)
The predictors are related to the response Y via a nonlinear function called the logit
As in linear regression, reducing the number of predictors can be done via variable selection
Logistic regression can be generalized to more than two classes