
Introduction to Machine Learning

Dr. Muhammad Amjad Iqbal
Associate Professor
University of Central Punjab, Lahore.
[email protected]
https://sites.google.com/a/ucp.edu.pk/mai/iml/

Slides adapted from Prof. Dr. Andrew Ng (Stanford) and Dr. Humayoun
Logistic Regression
A Classification Algorithm

One of the most popular and most widely used learning algorithms today.
Classification

Email: Spam / Not Spam?
Online Transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign?

Binary case, y ∈ {0, 1}:
  0: "Negative Class" (e.g., benign tumor)
  1: "Positive Class" (e.g., malignant tumor)

Multi-class case, y ∈ {0, 1, 2, 3}:
  0: "Negative Class" (e.g., benign tumor)
  1: "Positive Class 1" (e.g., type 1 tumor)
  2: "Positive Class 2" (e.g., type 2 tumor)
  3: "Positive Class 3" (e.g., type 3 tumor)
One idea: fit linear regression to the labeled tumor data and threshold the classifier output h_θ(x) at 0.5:
  If h_θ(x) ≥ 0.5, predict "y = 1"
  If h_θ(x) < 0.5, predict "y = 0"

This can look reasonable on one dataset, but adding a single example far to the right shifts the fitted line, and the same 0.5 threshold starts misclassifying examples. Using linear regression this way is a bad thing to do for classification; in the first case we just got lucky.
Classification: y = 0 or 1, but the linear regression hypothesis h_θ(x) can be > 1 or < 0.

Logistic regression guarantees 0 ≤ h_θ(x) ≤ 1. Despite the name "regression", it is used for classification tasks.
Hypothesis Representation

Logistic Regression Model
Want 0 ≤ h_θ(x) ≤ 1.

  h_θ(x) = g(θᵀx),  where  g(z) = 1 / (1 + e^(−z))

g is the sigmoid (logistic) function: it asymptotes at 0 for large negative z, at 1 for large positive z, and equals 0.5 at z = 0.

We need to select the parameters θ so that this hypothesis fits the data; we do that with an algorithm described later.
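A minimal sketch of this hypothesis in Python with NumPy (the function and variable names are mine, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h_theta(x) = g(theta^T x) for each row of X (X includes the x0 = 1 column)."""
    return sigmoid(X @ theta)

# g(0) = 0.5; large positive z -> close to 1; large negative z -> close to 0
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # ~[0.0000454, 0.5, 0.9999546]
```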
Interpretation of Hypothesis Output

h_θ(x) = estimated probability that y = 1 on a new input x.

Example: if h_θ(x) = 0.7 for a patient's tumor features, tell the patient there is a 70% chance of the tumor being malignant.

  h_θ(x) = P(y = 1 | x; θ),  "the probability that y = 1, given x, parameterized by θ".

Since y = 0 or 1:  P(y = 0 | x; θ) = 1 − P(y = 1 | x; θ).
Decision Boundary

Logistic regression: h_θ(x) = g(θᵀx), with g(z) = 1 / (1 + e^(−z)).

Predict "y = 1" if h_θ(x) ≥ 0.5; this happens exactly when z = θᵀx ≥ 0 (if z is positive, g(z) ≥ 0.5).
Predict "y = 0" if h_θ(x) < 0.5, i.e. when θᵀx < 0.

Example (linear boundary): h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2) with the plotted parameters θ_0 = −3, θ_1 = 1, θ_2 = 1.
Predict "y = 1" if −3 + x_1 + x_2 ≥ 0, i.e. x_1 + x_2 ≥ 3.
The line x_1 + x_2 = 3 is the decision boundary: any example with features x_1, x_2 satisfying this inequality is predicted as y = 1; anything below the line is predicted as y = 0.
Non-linear Decision Boundaries

Adding higher-order polynomial features gives non-linear boundaries, e.g. h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2 + θ_3 x_1² + θ_4 x_2²).

With the plotted parameters θ = [−1, 0, 0, 1, 1], predict "y = 1" if x_1² + x_2² ≥ 1: the decision boundary is the unit circle (crossing the axes at −1 and 1). With even higher-order features, the boundary can take far more complex shapes.
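As a hedged illustration of the decision rule (predict y = 1 exactly when θᵀx ≥ 0), here is a small sketch using the circular-boundary parameters assumed from the plot; the helper names and sample points are mine, not the lecture's:

```python
import numpy as np

def predict(theta, X, threshold=0.5):
    """Predict y = 1 where h_theta(x) >= threshold, i.e. where theta^T x >= 0."""
    return (X @ theta >= 0.0).astype(int)

# Polynomial features [1, x1, x2, x1^2, x2^2] and the assumed parameters
theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])
points = np.array([[0.5, 0.5],   # inside the unit circle  -> predict 0
                   [1.5, 0.0]])  # outside the unit circle -> predict 1
X = np.column_stack([np.ones(len(points)),
                     points[:, 0], points[:, 1],
                     points[:, 0]**2, points[:, 1]**2])
print(predict(theta, X))  # [0 1]
```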
Cost Function

To fit the parameters θ we need a training set of m examples:

  {(x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))},  with x_0 = 1 and y ∈ {0, 1},

and the hypothesis h_θ(x) = 1 / (1 + e^(−θᵀx)).

How do we choose the parameters θ?
Cost Function

For linear regression we used the squared-error cost

  J(θ) = (1/m) Σ_{i=1}^{m} ½ (h_θ(x^(i)) − y^(i))².

If we plug the logistic hypothesis h_θ(x) = 1 / (1 + e^(−θᵀx)) into this squared-error cost, J(θ) becomes a non-convex function of θ with many local optima. We want a convex cost function, so that gradient descent is guaranteed to reach the global minimum.
Logistic Regression Cost Function

  Cost(h_θ(x), y) = −log(h_θ(x))      if y = 1
  Cost(h_θ(x), y) = −log(1 − h_θ(x))  if y = 0

If y = 1: the cost is 0 when h_θ(x) = 1 and grows without bound as h_θ(x) → 0. This captures the intuition that if we predict h_θ(x) = 0 (i.e. P(y = 1 | x; θ) = 0) but in fact y = 1, we penalize the learning algorithm with a very large cost.

If y = 0: the cost is 0 when h_θ(x) = 0, but as h_θ(x) → 1 the cost goes to infinity; a confident but wrong prediction is again penalized very heavily.
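A tiny numeric sketch of these two cost curves (the example values are my own, chosen only to show the penalty growing as the prediction becomes confidently wrong):

```python
import numpy as np

def example_cost(h, y):
    """Cost(h, y) = -log(h) if y == 1, else -log(1 - h)."""
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

for h in (0.99, 0.5, 0.01):
    print(f"h={h:.2f}  cost if y=1: {example_cost(h, 1):6.2f}  "
          f"cost if y=0: {example_cost(h, 0):6.2f}")
# h=0.99: cheap when y=1 (~0.01), very expensive when y=0 (~4.61)
# h=0.01: the reverse
```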
Simplified Cost Function and Gradient Descent

Logistic regression cost function (a single expression covering both cases):

  Cost(h_θ(x), y) = −y log(h_θ(x)) − (1 − y) log(1 − h_θ(x))

  If y = 1:  Cost(h_θ(x), y) = −log(h_θ(x))
  If y = 0:  Cost(h_θ(x), y) = −log(1 − h_θ(x))
Logistic Regression Cost Function
Why do we choose this function when other cost functions exist?
• It can be derived from statistics using the principle of maximum likelihood estimation, an efficient method for finding parameters for many different models.
• It is a convex function.
Logistic Regression Cost Function

  J(θ) = −(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

To fit the parameters θ: minimize J(θ).

To make a prediction given a new x: output h_θ(x) = 1 / (1 + e^(−θᵀx)), the hypothesis's estimate of the probability that y = 1.
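A minimal sketch of this (unregularized) cost J(θ) in Python, assuming the `sigmoid` helper from the earlier sketch; the small epsilon guarding the logs against exact 0 or 1 is my own addition, not something the slides discuss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, eps=1e-12):
    """J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )."""
    m = len(y)
    h = sigmoid(X @ theta)
    h = np.clip(h, eps, 1.0 - eps)          # avoid log(0)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))
```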
Gradient Descent

Want min_θ J(θ).

Repeat {
  θ_j := θ_j − α ∂J(θ)/∂θ_j
}  (simultaneously update all θ_j)

where  ∂J(θ)/∂θ_j = (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i)
Gradient Descent

Substituting the derivative, the update becomes:

Repeat {
  θ_j := θ_j − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i)
}  (simultaneously update all θ_j)

The algorithm looks identical to linear regression! The difference is the hypothesis: for linear regression h_θ(x) = θᵀx, while for logistic regression h_θ(x) = 1 / (1 + e^(−θᵀx)), so the two algorithms are actually very different.
Hypothesis:        h_θ(x) = 1 / (1 + e^(−θᵀx))

Cost function:     J(θ) = −(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

Gradient descent:  θ_j := θ_j − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i)
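Putting the three pieces together, here is a hedged sketch of batch gradient descent for logistic regression; the learning rate, iteration count, and toy data are arbitrary choices of mine, not values from the lecture:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Simultaneously update all theta_j: theta_j -= alpha * (1/m) * sum((h - y) * x_j)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        h = sigmoid(X @ theta)
        grad = (X.T @ (h - y)) / m      # vector of partial derivatives dJ/dtheta_j
        theta = theta - alpha * grad    # simultaneous update of all parameters
    return theta

# Toy 1-D example: y = 1 roughly when the feature exceeds 3
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]])  # first column is x0 = 1
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
print(np.round(sigmoid(X @ theta), 2))  # predicted probabilities rise with the feature
```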
Multi-class Classification
One-vs-all Algorithm

Examples of multiclass problems:
  Email foldering/tagging: Work, Friends, Family, Hobby
  Medical diagnosis: Not ill, Cold, Flu
  Weather: Sunny, Cloudy, Rain, Snow

Binary classification has two groups of points in the (x_1, x_2) plane; multi-class classification has three or more groups, one per class.
One-vs-all (one-vs-rest):

Turn the multi-class problem into several binary problems: for each class i, treat class i as positive and all other classes as negative, and fit a classifier h_θ^(i)(x).

  Class 1: h_θ^(1)(x)
  Class 2: h_θ^(2)(x)
  Class 3: h_θ^(3)(x)

One-vs-all

Train a logistic regression classifier h_θ^(i)(x) for each class i to predict the probability that y = i.

On a new input x, to make a prediction, pick the class i that maximizes h_θ^(i)(x).
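A minimal one-vs-all sketch under the assumptions above, reusing the (hypothetical) `gradient_descent` helper from the previous sketch as the binary trainer: one classifier per class, then pick the class with the highest score.

```python
import numpy as np

def one_vs_all(X, y, classes, train_binary):
    """Fit one logistic regression classifier per class (class c vs. the rest)."""
    thetas = {}
    for c in classes:
        y_binary = (y == c).astype(float)   # 1 for class c, 0 for every other class
        thetas[c] = train_binary(X, y_binary)
    return thetas

def predict_one_vs_all(thetas, X):
    """For each example, pick the class whose classifier gives the largest h_theta(x)."""
    classes = list(thetas.keys())
    # theta^T x is monotone in h_theta(x), so comparing raw scores picks the same class
    scores = np.column_stack([X @ thetas[c] for c in classes])
    return np.array(classes)[np.argmax(scores, axis=1)]
```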
Regularization

The Problem of Overfitting
• So far we've seen a few learning algorithms (linear regression, logistic regression).
• They work well for many applications, but can suffer from the problem of overfitting.
Overfitting with Linear Regression
Example: linear regression on housing prices. Three fits of Price against Size: a straight line that underfits, a quadratic that fits well, and a high-order polynomial that passes through every point and overfits.

Overfitting: if we have too many features, the learned hypothesis may fit the training set very well (J(θ) ≈ 0) but fail to generalize to new examples (e.g., predicting prices of houses it has not seen).

The hypothesis is simply too flexible and too variable, and we don't have enough data to constrain it to give a good hypothesis.
Example: Logistic Regression
The same pattern appears in classification: in the (x_1, x_2) plane, a linear decision boundary may underfit, a moderate non-linear boundary fits well, and a highly convoluted boundary overfits (g = sigmoid function).
Addressing Overfitting
With many features (size of house, no. of bedrooms, no. of floors, age of house, average income in neighborhood, kitchen size, ...) predicting Price:
• Plotting the hypothesis is one way to decide whether overfitting occurs.
• But with lots of features and little data we cannot visualize the fit, and therefore it is:
  • hard to select the degree of polynomial, and
  • hard to decide which features to keep and which to drop.
Addressing Overfitting

Options:
1. Reduce the number of features (but this means losing information).
   ― Manually select which features to keep.
   ― Model selection algorithm (later in the course).
2. Regularization.
   ― Keep all the features, but reduce the magnitude/values of the parameters θ_j.
   ― Works well when we have a lot of features, each of which contributes a bit to predicting y.
Cost Function

Intuition
Two fits of Price against Size of house: a quadratic, and a high-order polynomial that overfits.

Suppose we penalize θ_3 and θ_4 and make them really small, e.g. minimize

  (1/2m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))² + 1000·θ_3² + 1000·θ_4².

The minimizer then has θ_3 ≈ 0 and θ_4 ≈ 0, and the fit is essentially quadratic again.

Regularization
Small values for the parameters θ_0, θ_1, ..., θ_n give:
  ― a "simpler" hypothesis
  ― less prone to overfitting

Housing example:
  ― Features: x_1, x_2, ..., x_n
  ― Parameters: θ_0, θ_1, ..., θ_n
Unlike the polynomial example, we don't know in advance which are the high-order terms, so how do we pick the parameters that need to be shrunk? With regularization, we take the cost function and modify it to shrink all the parameters.

By convention we don't penalize θ_0; the penalty runs from θ_1 onwards.
Regularization

The regularized objective (the cost function with a regularization term) is

  J(θ) = (1/2m) [ Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1}^{n} θ_j² ]

• Using this regularized objective, we get a much smoother curve of Price against Size of house, one that still fits the data and gives a much better hypothesis.

λ is the regularization parameter. It controls a trade-off between our two goals:
  1) fitting the training set well, and
  2) keeping the parameters small.
In regularized linear regression, we choose θ to minimize

  J(θ) = (1/2m) [ Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1}^{n} θ_j² ]

What if λ is set to an extremely large value (perhaps too large for our problem, say λ = 10^10)?
  - Algorithm works fine; setting λ to be very large can't hurt it.
  - Algorithm fails to eliminate overfitting.
  - Algorithm results in underfitting (fails to fit even the training data well).
  - Gradient descent will fail to converge.

The result is underfitting: with λ that large, θ_1, ..., θ_n are all driven close to 0, leaving h_θ(x) ≈ θ_0, a flat line across the Price vs. Size-of-house plot that fails to fit even the training data.
Regularized Linear Regression

Gradient descent:
Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_0^(i)
  θ_j := θ_j − α [ (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i) + (λ/m) θ_j ]    (j = 1, ..., n)
}  (regularized)

The θ_0 update is the same as before, since θ_0 is not regularized. The θ_j update can be rewritten as

  θ_j := θ_j (1 − α λ/m) − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i)

The interesting term is (1 − α λ/m): the learning rate α is usually small and m is large, so this factor is slightly less than 1 (e.g., around 0.99), and each update shrinks θ_j a little toward zero before taking the usual gradient step.
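A hedged sketch of one regularized gradient-descent update for linear regression, showing the (1 − αλ/m) shrinkage factor explicitly and leaving θ_0 unpenalized; the variable names are mine:

```python
import numpy as np

def regularized_linear_step(theta, X, y, alpha, lam):
    """One gradient step on J = (1/2m)[sum((h - y)^2) + lam * sum(theta_j^2, j >= 1)]."""
    m = len(y)
    h = X @ theta                      # linear regression hypothesis
    grad = (X.T @ (h - y)) / m         # unregularized gradient for every theta_j
    shrink = np.full_like(theta, 1.0 - alpha * lam / m)
    shrink[0] = 1.0                    # by convention theta_0 is not regularized
    return theta * shrink - alpha * grad
```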
Regularized Logistic Regression

Regularization likewise smooths an overly complex logistic regression decision boundary in the (x_1, x_2) plane.

Cost function:

  J(θ) = −(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ] + (λ/2m) Σ_{j=1}^{n} θ_j²

Gradient descent:
Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_0^(i)
  θ_j := θ_j − α [ (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i) + (λ/m) θ_j ]    (j = 1, ..., n)
}  (regularized)

The update rule looks identical to regularized linear regression, but h_θ(x) is now the sigmoid hypothesis.
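Finally, a sketch of the regularized logistic-regression cost and gradient as described above (again assuming the `sigmoid` helper from earlier); this is the kind of pair you could hand to an optimizer or to the gradient-descent loop sketched before:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost_and_grad(theta, X, y, lam):
    """J(theta) with the (lam/2m) * sum(theta_j^2, j >= 1) penalty, plus its gradient."""
    m = len(y)
    h = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    reg = (lam / (2.0 * m)) * np.sum(theta[1:] ** 2)            # theta_0 not penalized
    J = -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) + reg
    grad = (X.T @ (h - y)) / m
    grad[1:] += (lam / m) * theta[1:]                            # penalty term for j >= 1
    return J, grad
```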
End
