Logistic Regression
Jacquelyn Victoria & Tamer Wahba
1
Slide Ownership
Jacquelyn Victoria - 3 to 9
Tamer Wahba - 10 to 15
2
Regression Analysis + Classification
How can we predict a nominal class using regression analysis?
Consider a binary class:
Each instance x is a vector of feature values
Our output values or class labels are restricted to 0 or 1, i.e. f(x) ∈ {0, 1}
We need an h(x) where: 0 < h(x) < 1
We need a function which exhibits this behavior
3
Logistic Functions
Sigmoid Function σ(x) = 1 / (1 + e^(−x))
Asymptotes at y = 1 and y = 0
Easy to specify threshold (σ(0) = 0.5)
Results are interpreted as P(y = 1 | x)
As a result: hθ(x) = σ(θᵀx)
Where θ is a vector of weights
4
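As a minimal sketch of the two definitions on this slide, the sigmoid and the hypothesis hθ(x) = σ(θᵀx) can be written in plain Python. The function names `sigmoid` and `hypothesis` are illustrative choices rather than anything from the slides, and x is assumed to carry a leading 1 for the intercept.

```python
import math


def sigmoid(z):
    """Logistic function; mathematically its values lie strictly between 0 and 1."""
    # Branch on the sign of z so that exp() is only called on non-positive
    # arguments, which avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x); x[0] is assumed to be 1 (intercept)."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))
```

With all weights at zero the hypothesis returns 0.5 for any instance, which is the threshold value σ(0) mentioned above.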
Cost Function
Need to find hθ(x), a logistic function that represents our data
Need to find θ to fit our data
[Figure: per-example cost curves −log(x) and −log(1 − x)]
5
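The slide's formula for J(θ) did not survive as text. A common choice that matches the two curves −log(x) and −log(1 − x) is the average cross-entropy, J(θ) = −(1/m) Σ [y log hθ(x) + (1 − y) log(1 − hθ(x))]. The sketch below assumes that form; the names `cost`, `X` and `y` are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(theta, X, y):
    """Average cross-entropy J(theta) over the training examples."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, x_i)))
        # -log(h) is charged when y = 1, -log(1 - h) when y = 0.
        total += -y_i * math.log(h) - (1 - y_i) * math.log(1.0 - h)
    return total / len(X)
```

With θ = 0 every prediction is 0.5, so the cost is log 2 regardless of the labels; a θ that fits the data better should give a lower value.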
Gradient Descent
In order to find the minimum, we can use the partial derivatives of J(θ)
do {
    θj := θj − α · ∂J(θ)/∂θj    (for every j, updated simultaneously)
} until θ converges
Where α is the learning rate (almost always between 0 and 1; 0.1 to 0.3 is usually a good starting range)
6
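The loop above can be sketched as batch gradient descent. This assumes the average cross-entropy cost, whose partial derivative is ∂J/∂θj = (1/m) Σ (hθ(x) − y) xj. The stopping rule (largest parameter change below `tol`), the iteration cap and the four-point toy data set are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_descent(X, y, alpha=0.1, tol=1e-6, max_iters=100_000):
    """Minimize the average cross-entropy cost by batch gradient descent."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(max_iters):
        grad = [0.0] * n
        for x_i, y_i in zip(X, y):
            err = sigmoid(sum(t * v for t, v in zip(theta, x_i))) - y_i
            for j in range(n):
                grad[j] += err * x_i[j] / m
        # Simultaneous update of every theta_j.
        new_theta = [t - alpha * g for t, g in zip(theta, grad)]
        # "Converged" here means the largest parameter change is below tol.
        if max(abs(a - b) for a, b in zip(new_theta, theta)) < tol:
            return new_theta
        theta = new_theta
    return theta


# Toy, non-separable data: first column is the constant 1 for the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 1, 0, 1]
theta = gradient_descent(X, y)
```

The toy labels overlap on purpose: on perfectly separable data the weights keep growing and the loop would only stop at the iteration cap.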
Maximum Likelihood Estimation
7
do {
}until θ converges
Can also be calculated using: Iteratively Reweighted Least Squares
Multinomial data uses Softmax Regression
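For reference, the likelihood being maximized can be written out explicitly. This is the standard Bernoulli likelihood for logistic regression, not a formula recovered from the slide; m (the number of training instances) and the superscript (i) (the instance index) are notation assumed here.

```latex
\begin{aligned}
L(\theta) &= \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}} \bigl(1 - h_\theta(x^{(i)})\bigr)^{1 - y^{(i)}} \\
\ell(\theta) = \log L(\theta) &= \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr] \\
\frac{\partial \ell(\theta)}{\partial \theta_j} &= \sum_{i=1}^{m} \bigl(y^{(i)} - h_\theta(x^{(i)})\bigr)\, x_j^{(i)}
\end{aligned}
```

Maximizing ℓ(θ) is equivalent to minimizing the average cross-entropy cost −ℓ(θ)/m, so gradient ascent on ℓ(θ) and gradient descent on that cost take the same steps up to a factor of m in the step size.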
Interpreting hypothesis
8
Recall that σ(0) = 0.5 and that hθ(x) = σ(θᵀx)
[Figure: plot with axes x1 and x2]
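Because σ(0) = 0.5, predicting class 1 whenever hθ(x) ≥ 0.5 is the same as checking whether θᵀx ≥ 0. A small sketch of that rule follows; the weights are hypothetical and chosen only so that the boundary is the line x1 + x2 = 3.

```python
def predict(theta, x):
    """Return 1 when h_theta(x) >= 0.5, which happens exactly when theta^T x >= 0."""
    score = sum(t * v for t, v in zip(theta, x))
    return 1 if score >= 0 else 0


# Hypothetical weights: intercept -3, weight 1 on x1 and on x2,
# so the decision boundary is the line x1 + x2 = 3.
theta = [-3.0, 1.0, 1.0]
above = predict(theta, [1.0, 2.0, 2.0])  # x1 + x2 = 4, positive side
below = predict(theta, [1.0, 1.0, 1.0])  # x1 + x2 = 2, negative side
```

Note that the sigmoid never has to be evaluated to classify; it is only needed when the probability itself is wanted.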
Interpreting hθ
I want to create a model to give me the probability that I will pass a test given how many hours I have studied
Hours  0.50  0.75  1.00  1.25  1.50  1.75  1.75  2.00  2.25  2.50  2.75  3.00  3.25  3.50  4.00  4.25  4.50  4.75  5.00  5.50
Pass   0     0     0     0     0     0     1     0     1     0     1     0     1     0     1     1     1     1     1     1
Using this generated model, calculate my probability of passing given I have studied 3 hours
P(passing | study time = 3) = 0.61
9
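The slides do not say how the model behind the 0.61 figure was fitted. One way to reproduce it, sketched here as an assumption, is unregularized maximum likelihood via Newton's method (the idea behind iteratively reweighted least squares) on the table above. The names `fit_newton`, `b0` and `b1` are illustrative.

```python
import math

hours = [0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
         2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50]
passed = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_newton(xs, ys, iters=25):
    """Fit intercept b0 and slope b1 by Newton's method on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)      # per-instance weight
            g0 += y - p            # gradient w.r.t. the intercept
            g1 += (y - p) * x      # gradient w.r.t. the slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system (X^T W X) step = gradient by hand.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_newton(hours, passed)
p_three_hours = sigmoid(b0 + b1 * 3.0)
```

Under these assumptions `p_three_hours` comes out close to the 0.61 quoted on the slide.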
Logistic Regression Compared to Other Classifiers
Naive Bayes
Support Vector Machines
Decision Trees
10
vs Decision Tree
Assumptions
DT: decision boundaries parallel to axes
LR: one smooth boundary
Decision trees can be used when there are multiple decision boundaries
11
vs Naive Bayes
Feature Weights
NB: each weight is set independently, depending only on the class
LR: weights are set together such that the decision function tends to be high for positive classes and low for negative classes
Correlated features are not double-counted by logistic regression, whereas Naive Bayes treats them as independent evidence
12
vs Support Vector Machine
13
Both attempt to find a hyperplane separating the training samples
SVM: find the solution with maximum margin
LR: find the solution that maximizes the likelihood of the training labels, with no explicit margin requirement
SVM is a hard classifier while LR is probabilistic
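One way to see the difference is through the loss each method puts on a single training point. The sketch below uses the usual margin formulation, which is an assumption relative to the slides: labels are recoded as y ∈ {−1, +1} and the margin is y · θᵀx. The function names are illustrative.

```python
import math


def hinge_loss(margin):
    """SVM loss: exactly zero once a point is beyond the margin (margin >= 1)."""
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Logistic regression loss: always positive, shrinking as the margin grows."""
    return math.log1p(math.exp(-margin))
```

Points classified correctly with room to spare stop influencing an SVM, while logistic regression keeps a small pull from every point, which is what lets it report graded probabilities.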
Advantages
Works well with diagonal decision boundaries
Does not give undue weight to correlated features
Probabilistic outcomes
Disadvantages
Requires large sample size for stable results
14
Use Cases
Categorical outcomes
Large sample data
Minimal preprocessing
15
For more info...
Helpful links to go into more depth with Logistic Regression
Stanford Open Course (Logit regression section)
Logit Regression Tutorial (exercises in MATLAB)
Logit Regression Tutorial (no code)
How to use Logit Regression in Python
How to use Logit Regression in R
How to use Logit Regression in Java using Weka
16

Intro to Logistic Regression


Editor's Notes

  • #5: hθ(x) = σ(θᵀx)
  • #8: We’re trying to find the most likely theta given our training instances. This can be nondeterministic. We also need a stopping criterion.
  • #9: We can actually add terms such as x1^2 or x1*x2^4 to make a non-linear decision boundary.
  • #10: The first entry in our instance vector is always one, so that it multiplies the intercept weight. When we say theta transpose x, we mean that we transpose theta and then multiply it by x using matrix multiplication.
  • #14: Draw SVM diagram on board