DA 5230 – Statistical & Machine Learning
Lecture 5 – Logistic Regression
Maninda Edirisooriya
manindaw@uom.lk
Classification
• When the Y variable of a Supervised Learning problem takes one of several
discrete classes (e.g.: Color, Age groups) the problem is known as a
Classification problem
• A Classification problem has to predict/select a certain Category (or a
Class) as the dependent variable
• When there are only 2 classes to be classified, it is known as a Binary
Classification problem
E.g.: Predicting a person’s gender (either as male or female) by testosterone
concentration in blood, height and bone density
Binary Classification
• Output classes of a binary classification can be represented by either
• Boolean values, True or False (or Positive or Negative)
• Numbers 1 or 0
• True or 1 is used for the Positive Class, which is generally the class
we want to analyze
• False or 0 is used for the Negative Class, which is the other class
• E.g.: For classifying a tumor as malignant (a cancer) or benign (not a
cancer) by the tumor size, being malignant can be taken as the
Positive class and the benign class as the Negative class
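Binary Classification - Example
[Figure: data points plotted with X on the horizontal axis and Y on the vertical axis, where Y is either 0 (Benign) or 1 (Malignant)]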
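Binary Classification – with Linear Regression
[Figure: the same Y (0 = Benign, 1 = Malignant) vs. X data with a Linear Regression Classifier line fitted through it; a 0.5 level on the line separates the Malignant region from the Benign region]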
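Binary Classification – Problem with LR
[Figure: Y (0 = Benign, 1 = Malignant) vs. X with the Linear Regression Classifier line and its 0.5 level; some points fall on the wrong side of the Malignant/Benign split and are marked Misclassified]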
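One common way this misclassification arises is a single extreme data point pulling the fitted line, which moves the point where the line crosses 0.5. Whether the slide's figure shows exactly this case is an assumption. The sketch below uses made-up tumor sizes and a hand-written least squares fit (`fit_line` and `classify` are illustrative names) to show the effect.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def classify(xs, a, b, threshold=0.5):
    """Label a point 1 (Malignant) when the fitted line is at or above the threshold."""
    return [1 if a + b * x >= threshold else 0 for x in xs]


# Hypothetical tumor sizes and labels (0 = Benign, 1 = Malignant).
sizes = [1.0, 2.0, 3.0, 4.0]
labels = [0, 0, 1, 1]

a, b = fit_line(sizes, labels)
clean_predictions = classify(sizes, a, b)

# Add one very large malignant tumor and refit the line.
sizes_out = sizes + [20.0]
labels_out = labels + [1]
a2, b2 = fit_line(sizes_out, labels_out)
outlier_predictions = classify(sizes_out, a2, b2)

print(clean_predictions)
print(outlier_predictions)
```

With these numbers the first fit classifies all four tumors correctly, while after adding the large tumor the malignant tumor of size 3 falls below 0.5 and is labeled benign.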
Binary Classification – Requirement
[Figure: Y (0 = Benign, 1 = Malignant) vs. X showing the Linear Regression Classifier with its 0.5 level, together with the Required Regression Classifier (a variant of the Unit Step Function) that jumps from Benign to Malignant]
Binary Classification – Requirement
[Figure: the same plot; the Required Regression Classifier (a variant of the Unit Step Function) is marked as not differentiable at its jump, which is a problem for Gradient Descent]
Binary Classification – Requirement
[Figure: the same plot with a Continuous Regression Classifier, a smooth curve rising from 0 (Benign) to 1 (Malignant), shown alongside the Linear Regression Classifier and its 0.5 level]
Logistic/Sigmoid Function
• Sigmoid function: $f(Z) = \dfrac{1}{1 + e^{-Z}}$
• Z = 0 ⇒ f(Z) = 0.5
• 0 < f(Z) < 1
• A Non-linear function
• This is a continuous alternative
to the Unit Step Function
[Figure: S-shaped curve of f(Z) plotted against Z]
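The properties listed above can be checked numerically. This is a minimal sketch of the sigmoid function. The branch on the sign of the input is an added numerical safeguard that is not part of the slides.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function f(Z) = 1 / (1 + e^(-Z))."""
    # Branching on the sign avoids overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

The value at 0 is exactly 0.5, and every other output lies strictly between 0 and 1 for moderate inputs.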
Logistic Regression
As in Linear Regression, say $Z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$
Logistic Function: $f(Z) = \dfrac{1}{1 + e^{-Z}}$
Substituting Z,
$$f(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$$
In vector form,
$$f(X) = \frac{1}{1 + e^{-\beta^T X}}$$
where β0 = β0*X0 taking X0 = 1
This is the function of Logistic Regression.
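The vector form can be evaluated directly by prepending X0 = 1 to the feature vector. The sketch below assumes plain Python lists. The function name `predict_proba` and the β values are made up for illustration.

```python
import math


def predict_proba(beta, x):
    """f(X) = 1 / (1 + e^(-beta^T X)), with X0 = 1 prepended for the intercept."""
    x_with_bias = [1.0] + list(x)
    z = sum(b * xi for b, xi in zip(beta, x_with_bias))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters: beta0 = -4, beta1 = 1, beta2 = 0.5.
beta = [-4.0, 1.0, 0.5]

print(predict_proba(beta, [2.0, 4.0]))  # Z = -4 + 2 + 2 = 0, so f(X) = 0.5
print(predict_proba(beta, [6.0, 4.0]))  # Z = 4, so f(X) is above 0.5
```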
Logistic Regression - Prediction
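Let’s take the prediction as
$$\text{Prediction} = \begin{cases} 1 \text{ (or Positive)} & \text{if } f(X) \ge 0.5 \\ 0 \text{ (or Negative)} & \text{if } f(X) < 0.5 \end{cases}$$
Since f increases with its argument and f(0) = 0.5,
$$\text{Positive} \Rightarrow f(X) \ge 0.5 \Rightarrow \frac{1}{1 + e^{-\beta^T X}} \ge 0.5 \Rightarrow \beta^T X \ge 0$$
$$\text{Negative} \Rightarrow f(X) < 0.5 \Rightarrow \frac{1}{1 + e^{-\beta^T X}} < 0.5 \Rightarrow \beta^T X < 0$$
Here, $\beta^T X = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$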
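Because thresholding f(X) at 0.5 is the same as checking the sign of βᵀX, a prediction does not need the exponential at all. The sketch below compares both rules on a few points. The β values and the points are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(beta, x):
    """Z = beta0 + beta1*X1 + ... + betan*Xn (beta[0] is the intercept)."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def predict_by_probability(beta, x):
    """Predict 1 (Positive) when f(X) >= 0.5."""
    return 1 if sigmoid(linear_score(beta, x)) >= 0.5 else 0


def predict_by_sign(beta, x):
    """Predict 1 (Positive) when beta^T X >= 0."""
    return 1 if linear_score(beta, x) >= 0 else 0


beta = [-3.0, 1.0, 1.0]  # hypothetical parameters
points = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [4.0, 0.5]]

for p in points:
    print(p, predict_by_probability(beta, p), predict_by_sign(beta, p))
```

Both rules give the same label for every point, which is why the decision boundary is the set of points where Z = 0.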
Prediction Example
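Take a classification problem with 2 independent variables where
$$f(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2)}}$$
with $Z = \beta_0 + \beta_1 X_1 + \beta_2 X_2$. The decision boundary is the line where Z = 0.
[Figure: Negative and Positive points in the X1-X2 plane separated by the decision boundary line; the side with Z > 0 is Positive and the side with Z < 0 is Negative]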
Non-linear Classification
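Taking polynomials of the X values (as discussed in Polynomial Regression)
makes it possible to classify non-linear data points with non-linear decision boundaries
E.g.:
$$f(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1^2 + \beta_2 X_2^2)}}$$
with $Z = \beta_0 + \beta_1 X_1^2 + \beta_2 X_2^2$. The decision boundary is the curve where Z = 0.
[Figure: Negative and Positive points in the X1-X2 plane separated by a curved decision boundary; the region with Z > 0 is Positive and the region with Z < 0 is Negative]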
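As a small illustration of a non-linear boundary, the sketch below uses squared features with made-up parameters β0 = -1, β1 = β2 = 1, so that Z = 0 is the unit circle. These values are assumptions chosen for the example, not taken from the slide.

```python
def classify(x1, x2, beta=(-1.0, 1.0, 1.0)):
    """Classify with Z = beta0 + beta1*X1^2 + beta2*X2^2 and the rule Z >= 0."""
    b0, b1, b2 = beta
    z = b0 + b1 * x1 ** 2 + b2 * x2 ** 2
    return "Positive" if z >= 0 else "Negative"


print(classify(0.2, 0.3))   # inside the unit circle, Z = -0.87
print(classify(1.5, 0.0))   # outside the unit circle, Z = 1.25
```

Points inside the circle get Z < 0 and points outside get Z > 0, a split that no straight line in X1 and X2 could produce.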
Binary Logistic Regression – Cost Function
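Cost for a single data point is known as the Loss
Take the Loss Function of Logistic Regression as L(f(X), Y)
$$L(f(X), Y) = \begin{cases} -\log f(X) & \text{if } Y = 1 \\ -\log(1 - f(X)) & \text{if } Y = 0 \end{cases}$$
Both cases can be written as a single expression:
$$L(f(X), Y) = -Y \log f(X) - (1 - Y) \log(1 - f(X))$$
Cost function, where n is here the number of data points and $(X^{(i)}, Y^{(i)})$ is the i-th data point:
$$J(\beta) = \frac{1}{n} \sum_{i=1}^{n} L(f(X^{(i)}), Y^{(i)})$$
$$J(\beta) = \frac{1}{n} \sum_{i=1}^{n} \left[ -Y^{(i)} \log f(X^{(i)}) - (1 - Y^{(i)}) \log(1 - f(X^{(i)})) \right]$$
This Cost Function is Convex (has a Global Minimum)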
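The cost can be computed directly from its definition. The sketch below uses a made-up one-feature data set and two hand-picked parameter vectors to show that parameters that separate the classes give a lower cost. The names `loss` and `cost` are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(p, y):
    """Loss for one data point: -Y log f(X) - (1 - Y) log(1 - f(X))."""
    return -y * math.log(p) - (1 - y) * math.log(1.0 - p)


def cost(beta, X, Y):
    """Average loss J(beta) over all data points (beta[0] is the intercept)."""
    total = 0.0
    for x, y in zip(X, Y):
        z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
        total += loss(sigmoid(z), y)
    return total / len(X)


# Hypothetical data: one feature, labels 0 or 1.
X = [[1.0], [2.0], [3.0], [4.0]]
Y = [0, 0, 1, 1]

print(cost([0.0, 0.0], X, Y))    # every f(X) is 0.5, so the cost is log(2)
print(cost([-5.0, 2.0], X, Y))   # boundary at X = 2.5 separates the classes
```

With all parameters at zero every prediction is 0.5 and the cost equals log 2. The second parameter vector places the boundary between the two classes and gives a smaller cost.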
Multiclass Logistic Regression
• Up to now we have looked at Binary Classification problems where
there can be only two outcomes/categories/classes as the Y variable
• When there are more than 2 classes available (only one of them is
positive for any given data point) the problem becomes a Multiclass
Classification problem
• One way to handle Multiclass Classification is to combine Binary
Classifiers in a scheme known as One-vs-All (OvA), also known as One-vs-Rest
(OvR)
• It trains multiple binary classifiers, each one predicting the confidence
(probability) of one class against the rest, and the class with the highest confidence is selected
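The selection step of One-vs-All can be sketched as follows, assuming one already-trained binary logistic classifier per class. The class names and parameter values are made up for illustration, and no training is shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ova_predict(betas, x):
    """Score x with each class's binary classifier and pick the most confident class."""
    scores = {}
    for label, beta in betas.items():
        z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
        scores[label] = sigmoid(z)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical parameters (intercept first) for three "class vs. rest" classifiers.
betas = {
    "red": [0.0, 2.0, -1.0],
    "green": [0.0, -1.0, 2.0],
    "blue": [1.0, -1.0, -1.0],
}

label, scores = ova_predict(betas, [2.0, 0.5])
print(label, scores)
```

Note that the individual confidences come from independent classifiers, so they do not have to sum to 1.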
Multiclass Logistic Regression
• OvA can be used
• When you want to use different binary classifiers (e.g., SVMs or logistic
regression) for each class
• When available memory is limited or the training needs to be highly parallelized
• There is another technique for Multiclass Logistic Regression by
simply generalizing the binary classification problem of the Logistic
Regression
• This General form of Classifier is known as the Softmax Classifier
• There, the Softmax Function is used instead of the Sigmoid function
when there are multiple classes
Softmax Function
• The name Softmax is used, as it is a continuous function
approximation to the Maximum Function, where only one class
(maximum) is allowed to be considered as Positive
• Softmax function is used instead of the Maximum Function to make
the function differentiable
• Softmax Function: $S(x_i) = \dfrac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$
where $x_i$ is the i-th component of the input vector x, and j runs over all n dimensions of that vector
Softmax Function
• Softmax function exponentially highlights the value in the dimension
where the value is maximum, while suppressing all other dimensions
• The output values of a vector from a Softmax function sum to 1
[Figure: an example Input Vector passed through the Softmax Function to give an Output Vector]
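A minimal sketch of the softmax function on a plain Python list is given below. Subtracting the maximum before exponentiating is an added numerical safeguard, not part of the slides, and it does not change the result. The input vector is made up.

```python
import math


def softmax(x):
    """Return e^(x_i) / sum_j e^(x_j) for every component of x."""
    m = max(x)  # shifting by the maximum avoids overflow in math.exp
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


outputs = softmax([1.0, 2.0, 5.0])
print(outputs)
print(sum(outputs))
```

The largest input receives most of the probability mass, while the outputs remain positive and sum to 1.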
Softmax Regression
• Just as $Z = \beta^T X$ is used for binary classification, $Z_k = \beta_k^T X$ is used for
Multiclass classification, where k is the index of the class
• Note that there are K β vectors as model parameters, one per class
• Just as Y is the single dependent variable in binary classification,
Multiclass classification has K dependent variables, each denoted by $Y_k$,
with its estimator $\hat{Y}_k$
$$\hat{Y}_k = \frac{e^{Z_k}}{\sum_{j=1}^{K} e^{Z_j}}$$
Softmax Regression
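Loss function, where k is the true class of the data point:
$$L(f(X), Y) = -\log(\hat{Y}_k) = -\log\left(\frac{e^{Z_k}}{\sum_{j=1}^{K} e^{Z_j}}\right) = -\log\left(\frac{e^{\beta_k^T X}}{\sum_{j=1}^{K} e^{\beta_j^T X}}\right)$$
Cost function (Cross Entropy Loss), where $(X^{(i)}, Y^{(i)})$ is the i-th of N data points and I[·] is 1 when its condition holds and 0 otherwise:
$$J(\beta) = -\sum_{i=1}^{N} \sum_{k=1}^{K} I[Y^{(i)} = k] \log\left(\frac{e^{\beta_k^T X^{(i)}}}{\sum_{j=1}^{K} e^{\beta_j^T X^{(i)}}}\right)$$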
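The cross entropy cost can be sketched as below. Because the indicator keeps only the term for the true class, the inner sum over k reduces to one log term per data point. The code indexes classes from 0 rather than 1, and the data and parameter values are made up for illustration.

```python
import math


def softmax(z):
    m = max(z)  # shift for numerical stability; the result is unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_cost(betas, X, Y):
    """Sum over data points of -log(probability assigned to the true class).

    betas: list of K parameter vectors (intercept first), one per class.
    Y: true class indices in 0..K-1.
    """
    total = 0.0
    for x, y in zip(X, Y):
        x_with_bias = [1.0] + list(x)
        z = [sum(b * xi for b, xi in zip(beta_k, x_with_bias)) for beta_k in betas]
        probs = softmax(z)
        total -= math.log(probs[y])
    return total


# Hypothetical data: two points with one feature each, three classes.
X = [[1.0], [3.0]]
Y = [0, 2]

zero_betas = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
better_betas = [[2.0, -1.0], [0.0, 0.0], [-2.0, 1.0]]

print(cross_entropy_cost(zero_betas, X, Y))
print(cross_entropy_cost(better_betas, X, Y))
```

With all parameters at zero each class gets probability 1/3, so the cost is N·log 3. Parameters that favor the true classes give a lower cost.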
One Hour Homework
• Officially we have one more hour to do after the end of the lectures
• Therefore, for this week’s extra hour you have a homework
• Logistic Regression is the basic building block of Deep Neural Networks (DNN).
Softmax classifiers are used as-is in DNNs as the final classification layer
• Go through the slides and get a clear understanding of Logistic and Softmax
Regressions
• Refer to external sources to clarify any ambiguities related to them
• Good Luck!
Questions?

Lecture 6 - Logistic Regression, a lecture in subject module Statistical & Machine Learning

  • 1. DA 5230 – Statistical & Machine Learning Lecture 5 – Logistic Regression Maninda Edirisooriya manindaw@uom.lk
  • 2. Classification • When the Y variable of a Supervised Learning problem takes one of several discrete classes (e.g.: Color, Age groups) the problem is known as a Classification problem • A Classification problem has to predict/select a certain Category (or a Class) as the dependent variable • When there are only 2 classes to be classified, it is known as a Binary Classification problem E.g.: Predicting a person’s gender (either as male or female) by testosterone concentration in blood, height and bone density
  • 3. Binary Classification • Output classes of a binary classification can be represented by either • Boolean values, True or False (or Positive or Negative) • Numbers 1 or 0 • True or 1 is used for the Positive Class, which is generally the class we want to analyze • False or 0 is used for the Negative Class, which is the other class • E.g.: For classifying a tumor as malignant (a cancer) or benign (not a cancer) by the tumor size, the malignant class can be taken as the Positive class and the benign class as the Negative class
  • 4. Binary Classification - Example [Figure: data points plotted as Y (0 = Benign, 1 = Malignant) against X]
  • 5. Binary Classification – with Linear Regression [Figure: the same plot with a Linear Regression Classifier line; a threshold of 0.5 on Y separates the Malignant and Benign predictions]
  • 6. Binary Classification – Problem with LR [Figure: the same plot with a Linear Regression Classifier line and the 0.5 threshold; some data points are marked as Misclassified]
  • 7. Binary Classification – Requirement [Figure: the Linear Regression Classifier compared with the Required Regression Classifier (Variant of Unit Step Function)]
  • 8. Binary Classification – Requirement [Figure: the Required Regression Classifier (Variant of Unit Step Function), annotated "Not Differentiable here for Gradient Descent" at the step]
  • 9. Binary Classification – Requirement [Figure: the Linear Regression Classifier compared with a Continuous Regression Classifier]
  • 10. Logistic/Sigmoid Function • Sigmoid function: $f(Z) = \frac{1}{1 + e^{-Z}}$, where $Z = 0 \Rightarrow f(Z) = 0.5$ and $0 < f(Z) < 1$ • A Non-linear function • This is a continuous alternative for the Unit Step Function [Figure: plot of f(Z) against Z]
  • 11. Logistic Regression Like Linear Regression, say $Z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$. Logistic Function: $f(Z) = \frac{1}{1 + e^{-Z}}$, so $f(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$. In vector form, $f(X) = \frac{1}{1 + e^{-\beta^T X}}$, where $\beta_0 = \beta_0 X_0$ taking $X_0 = 1$. This is the function of Logistic Regression.
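The logistic regression function on this slide can be sketched in plain Python. This is only an illustrative sketch: the coefficient values are hypothetical, and the two-branch form of the sigmoid is an implementation choice (not from the slides) that avoids overflow for large negative Z while giving the same values.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function f(Z) = 1 / (1 + e^(-Z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Algebraically identical form, used so math.exp never overflows
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logistic_regression(beta, x):
    """f(X) = sigmoid(beta^T X), with X0 = 1 prepended so beta[0] is the intercept."""
    x_with_bias = [1.0] + list(x)
    z = sum(b * xi for b, xi in zip(beta, x_with_bias))
    return sigmoid(z)

# Hypothetical coefficients (beta0, beta1, beta2), chosen only for illustration
beta = [-1.0, 2.0, 0.5]

print(sigmoid(0.0))                           # 0.5
print(logistic_regression(beta, [0.5, 0.0]))  # Z = -1 + 1 + 0 = 0, so 0.5
```

Prepending X0 = 1 lets the intercept be handled by the same dot product as the other coefficients, which is what the vector form expresses.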
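  • 12. Logistic Regression - Prediction Let’s take the prediction as 1 (or Positive) if $f(X) \geq 0.5$ and 0 (or Negative) if $f(X) < 0.5$. Then, Positive $\Rightarrow f(X) \geq 0.5 \Rightarrow \frac{1}{1 + e^{-\beta^T X}} \geq 0.5 \Rightarrow \beta^T X \geq 0$, and Negative $\Rightarrow f(X) < 0.5 \Rightarrow \frac{1}{1 + e^{-\beta^T X}} < 0.5 \Rightarrow \beta^T X < 0$. Here, $\beta^T X = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$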
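As a small check of the prediction rule, the sketch below compares thresholding the sigmoid output at 0.5 with testing the sign of the linear term directly. The coefficients and test points are hypothetical values picked for illustration.

```python
import math

def sigmoid(z):
    """f(Z) = 1 / (1 + e^(-Z)), written to avoid overflow for large |Z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_term(beta, x):
    """beta^T X = beta0 + beta1*X1 + ... + betan*Xn."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def predict_by_probability(beta, x):
    """Return 1 (Positive) if f(X) >= 0.5, otherwise 0 (Negative)."""
    return 1 if sigmoid(linear_term(beta, x)) >= 0.5 else 0

def predict_by_sign(beta, x):
    """Return 1 (Positive) if beta^T X >= 0, otherwise 0 (Negative)."""
    return 1 if linear_term(beta, x) >= 0 else 0

# Hypothetical coefficients and data points, for illustration only
beta = [-3.0, 1.0, 1.0]
points = [[1.0, 1.0], [1.5, 1.5], [2.0, 2.0]]

for x in points:
    print(x, predict_by_probability(beta, x), predict_by_sign(beta, x))
```

Both functions give the same label for each point, so a prediction only needs the sign of the linear term; the sigmoid is needed when the probability itself is of interest.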
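  • 13. Prediction Example Take a classification problem with 2 independent variables where $f(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2)}}$ [Figure: the X1–X2 plane split by the straight-line decision boundary $Z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 = 0$; the side with Z > 0 is Positive and the side with Z < 0 is Negative]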
  • 14. Non-linear Classification Taking polynomials of X values (as discussed in Polynomial Regression) can classify non-linear data points with non-linear decision boundaries. E.g.: $f(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1^2 + \beta_2 X_2^2)}}$ [Figure: the X1–X2 plane split by the curved decision boundary $Z = \beta_0 + \beta_1 X_1^2 + \beta_2 X_2^2 = 0$; the region with Z > 0 is Positive and the region with Z < 0 is Negative]
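To make the non-linear decision boundary concrete, here is a minimal sketch using squared features. The coefficients are hypothetical: with beta0 = -1 and beta1 = beta2 = 1 the boundary Z = 0 is the unit circle, so points inside it are Negative and points outside are Positive.

```python
def classify_with_squares(beta, x1, x2):
    """Classify using Z = beta0 + beta1*X1^2 + beta2*X2^2 and the sign of Z."""
    z = beta[0] + beta[1] * x1 ** 2 + beta[2] * x2 ** 2
    return "Positive" if z >= 0 else "Negative"

# Hypothetical coefficients: the boundary Z = 0 is the circle X1^2 + X2^2 = 1
beta = [-1.0, 1.0, 1.0]

print(classify_with_squares(beta, 0.0, 0.0))  # Negative (Z = -1, inside the circle)
print(classify_with_squares(beta, 2.0, 0.0))  # Positive (Z = 3, outside the circle)
```

The model is still linear in the coefficients; only the input features have been transformed, which is why the same training procedure applies.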
  • 15. Binary Logistic Regression – Cost Function Cost for a single data point is known as the Loss. Take the Loss Function of Logistic Regression as $L(f(X), Y)$: $L(f(X), Y) = \begin{cases} -\log f(X) & \text{if } Y = 1 \\ -\log(1 - f(X)) & \text{if } Y = 0 \end{cases}$, which can be written as $L(f(X), Y) = -Y \log f(X) - (1 - Y) \log(1 - f(X))$. Cost function: $J(\beta) = \frac{1}{n} \sum_{i=1}^{n} L(f(X_i), Y_i) = \frac{1}{n} \sum_{i=1}^{n} [-Y_i \log f(X_i) - (1 - Y_i) \log(1 - f(X_i))]$. This Cost Function is Convex (has a Global Minimum)
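The loss and cost above can be sketched as follows. The predicted probabilities and labels are made-up values, and the sketch assumes every probability lies strictly between 0 and 1 so that the logarithms are defined.

```python
import math

def loss(p, y):
    """Loss for one data point: -Y*log(f(X)) - (1 - Y)*log(1 - f(X)), with p = f(X)."""
    return -y * math.log(p) - (1 - y) * math.log(1 - p)

def cost(probabilities, labels):
    """Cost J: the mean of the per-point losses over n data points."""
    n = len(labels)
    return sum(loss(p, y) for p, y in zip(probabilities, labels)) / n

# Hypothetical model outputs f(X_i) and true labels Y_i, for illustration only
probabilities = [0.9, 0.2, 0.6]
labels = [1, 0, 1]

print(loss(0.9, 1))  # small loss: confident and correct
print(loss(0.1, 1))  # large loss: confident and wrong
print(cost(probabilities, labels))
```

The loss grows without bound as the predicted probability of the true class approaches 0, which is what penalizes confident wrong predictions.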
  • 16. Multiclass Logistic Regression • Up to now we have looked at Binary Classification problems where there can be only two outcomes/categories/classes as the Y variable • When there are more than 2 classes available (only one of them is positive for any given data point) the problem becomes a Multiclass Classification problem • One way to handle Multiclass Classification is to combine Binary Classifiers in a scheme known as One-vs-All (OvA), also known as One-vs-Rest (OvR) • It trains multiple binary classifiers, each one predicting the confidence (probability) of one class against the rest, and the class with the highest confidence is selected
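The selection step of One-vs-All can be sketched as below. This assumes the per-class binary classifiers are already trained; the class names and coefficient vectors are hypothetical and chosen only so the example has a clear winner.

```python
import math

def sigmoid(z):
    """f(Z) = 1 / (1 + e^(-Z)), written to avoid overflow for large |Z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def one_vs_all_predict(betas_by_class, x):
    """Score x with each class-vs-rest classifier and pick the highest confidence."""
    confidences = {}
    for label, beta in betas_by_class.items():
        z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
        confidences[label] = sigmoid(z)
    best_label = max(confidences, key=confidences.get)
    return best_label, confidences

# Hypothetical, already-trained binary classifiers (one coefficient vector per class)
betas_by_class = {
    "red": [-1.0, 1.0, 0.0],
    "green": [-1.0, 0.0, 1.0],
    "blue": [0.0, -1.0, -1.0],
}

label, confidences = one_vs_all_predict(betas_by_class, [2.0, 0.0])
print(label)        # red
print(confidences)
```

Because each binary classifier is independent, the confidences here do not need to sum to 1.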
  • 17. Multiclass Logistic Regression • OvA can be used • When you want to use different binary classifiers (e.g., SVMs or logistic regression) for each class • When available memory is limited or when training needs to be highly parallelized • There is another technique for Multiclass Logistic Regression, which simply generalizes the binary classification problem of Logistic Regression • This general form of classifier is known as the Softmax Classifier • There, the Softmax Function is used instead of the Sigmoid function when there are multiple classes
  • 18. Softmax Function • The name Softmax is used, as it is a continuous function approximation to the Maximum Function, where only one class (maximum) is allowed to be considered as Positive • Softmax function is used instead of the Maximum Function to make the function differentiable • Softmax Function: $S(X)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$ where $x_i$ is the i-th component of the input vector X and j runs over the n dimensions of X
  • 19. Softmax Function • Softmax function exponentially highlights the value in the dimension where the value is maximum, while suppressing all other dimensions • The output values of a Softmax function sum to 1 • E.g.: [Figure: an input vector passed through the Softmax Function to give an output vector]
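A minimal sketch of the Softmax function follows. Subtracting the maximum input before exponentiating is an added numerical-stability step, not something from the slides; it leaves the result unchanged because the common factor cancels in the ratio. The input vector is a made-up example.

```python
import math

def softmax(x):
    """Softmax: e^(x_i) divided by the sum of e^(x_j) over all dimensions j."""
    largest = max(x)
    # Shifting by the largest value avoids overflow and does not change the ratios
    exps = [math.exp(v - largest) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input vector, for illustration only
inputs = [1.0, 2.0, 5.0]
outputs = softmax(inputs)

print(outputs)
print(sum(outputs))  # approximately 1
```

The largest input takes most of the output mass while the others are pushed towards 0, which is the "highlighting" behaviour described on the slide.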
  • 20. Softmax Regression • Just as $Z = \beta^T X$ is used for binary classification, $Z_k = \beta_k^T X$ is used for Multiclass classification, where k is the index of the class • Note that there are K β vectors as model parameters • Just as Y is used for binary classification, where there is only a single dependent variable, Multiclass classification has K dependent variables, each denoted by $Y_k$, with its estimator $\hat{Y}_k = \frac{e^{Z_k}}{\sum_{j=1}^{K} e^{Z_j}}$
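  • 21. Softmax Regression Loss function: $L(f(X), Y) = -\log(\hat{Y}_k) = -\log\left(\frac{e^{Z_k}}{\sum_{j=1}^{K} e^{Z_j}}\right) = -\log\left(\frac{e^{\beta_k^T X}}{\sum_{j=1}^{K} e^{\beta_j^T X}}\right)$ Cost function (Cross Entropy Loss): $J(\beta) = -\sum_{i=1}^{N} \sum_{k=1}^{K} I[Y_i = k] \log\left(\frac{e^{\beta_k^T X_i}}{\sum_{j=1}^{K} e^{\beta_j^T X_i}}\right)$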
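The Softmax Regression cost can be sketched as below. Assumptions to note: classes are indexed from 0 to K-1 in the code (the slides use 1 to K), the indicator I[Y_i = k] is implemented by looking up the probability of the true class only, and the coefficient vectors and data are hypothetical.

```python
import math

def softmax(z):
    """Softmax over a list of scores, shifted by the maximum for numerical stability."""
    largest = max(z)
    exps = [math.exp(v - largest) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(betas, x):
    """Estimators Y_hat_k = softmax(Z)_k with Z_k = beta_k^T X and X0 = 1."""
    x_with_bias = [1.0] + list(x)
    z = [sum(b * xi for b, xi in zip(beta_k, x_with_bias)) for beta_k in betas]
    return softmax(z)

def cross_entropy_cost(betas, xs, ys):
    """J(beta) = -sum over data points of log(probability of the true class)."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Only the term with k equal to the true class survives the indicator
        total += -math.log(class_probabilities(betas, x)[y])
    return total

# Hypothetical parameters for K = 3 classes and two features, for illustration only
betas = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, -1.0, -1.0]]
xs = [[2.0, 0.0], [0.0, 2.0]]
ys = [0, 1]  # true class index of each data point

print(class_probabilities(betas, xs[0]))
print(cross_entropy_cost(betas, xs, ys))
```

With all coefficients set to zero every class gets probability 1/K, so the cost for N data points is N·log(K); this is a convenient sanity check for an implementation.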
  • 22. One Hour Homework • Officially we have one more hour to do after the end of the lectures • Therefore, for this week’s extra hour you have a homework • Logistic Regression is the basic building block of Deep Neural Networks (DNN). Softmax classifiers are used as-is in DNNs as the final classification layer • Go through the slides and get a clear understanding of Logistic and Softmax Regression • Refer to external sources to clarify any ambiguities related to them • Good Luck!