ML Assignment KV2

Logistic Regression is a widely used supervised machine learning algorithm for predicting binary outcomes based on input variables. It employs the sigmoid function to map predictions to probabilities and has several variants, including binary, multinomial, and ordinal logistic regression. This document also discusses its mathematical foundation, evaluation metrics, advantages, limitations, and real-world applications.

1. Introduction

Logistic Regression is one of the most popular algorithms in supervised machine learning. It is used to predict the probability of a binary outcome, such as yes/no, true/false, or success/failure, based on input variables.

Despite its simplicity, logistic regression is widely used in domains such as medical diagnosis, marketing, and the social sciences. This assignment explores logistic regression in depth: its theory, use cases, evaluation, and coding.

2. What is Logistic Regression?

Logistic Regression is a statistical method used for binary classification problems. It estimates the probability that an instance belongs to a certain class.

While linear regression predicts continuous outcomes, logistic regression maps predictions to probabilities using the sigmoid function and classifies the result as 0 or 1 based on a decision boundary (usually 0.5).
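As a minimal illustration of this classification rule, the sketch below fits a binary logistic regression with scikit-learn; the data (hours studied vs. pass/fail) is made up for demonstration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One feature (e.g. hours studied), binary label (fail = 0, pass = 1)
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# predict() applies the default 0.5 decision boundary to the
# probability returned by predict_proba()
print(model.predict([[2]]))          # low hours -> class 0 on this data
print(model.predict([[7]]))          # high hours -> class 1 on this data
print(model.predict_proba([[4.5]]))  # probabilities near the boundary
```

Here `predict_proba` exposes the underlying probability, while `predict` simply thresholds it at 0.5.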

3. Mathematical Foundation

In logistic regression, we compute a weighted sum of the input features:

z = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n

We then apply the sigmoid function:

\sigma(z) = \frac{1}{1 + e^{-z}}

This converts the output into a probability between 0 and 1. If the result is ≥ 0.5, we classify the instance as 1 (the positive class); otherwise as 0.
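The two equations above can be traced step by step in NumPy; the weights b_0, b_1, b_2 and the inputs x_1, x_2 below are made-up illustrative values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

b0 = -1.0                      # intercept b_0
b = np.array([0.5, 2.0])       # weights b_1, b_2
x = np.array([1.0, 0.75])      # input features x_1, x_2

z = b0 + np.dot(b, x)          # weighted sum: z = b_0 + b_1*x_1 + b_2*x_2
p = sigmoid(z)                 # probability between 0 and 1

label = 1 if p >= 0.5 else 0   # classify with the 0.5 threshold
print(z, p, label)             # z = 1.0, p ≈ 0.731, label = 1
```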

4. Sigmoid Function & Decision Boundary

The sigmoid function is:

\sigma(z) = \frac{1}{1 + e^{-z}}

At z = 0, output is 0.5

As z → ∞, output → 1

As z → -∞, output → 0

The output probability is used to classify inputs. The decision boundary is the set of inputs where the predicted probability equals the threshold (usually 0.5); the model assigns class 1 on one side of it and class 0 on the other.
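The three limiting properties listed above can be checked directly:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))      # exactly 0.5 at z = 0
print(sigmoid(20))     # approaches 1 as z grows large
print(sigmoid(-20))    # approaches 0 as z -> -infinity
```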

5. Types of Logistic Regression

1. Binary Logistic Regression:

Used when the output has two categories.

Example: Is a customer likely to buy? Yes or No.

2. Multinomial Logistic Regression:

Used when the outcome has more than two unordered categories.

Example: Classifying animals as Dog, Cat, or Rabbit.

3. Ordinal Logistic Regression:

Used when the outcome has ordered categories.

Example: Rating a product as Poor, Average, Good, Excellent.
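The multinomial case can be sketched with scikit-learn, which handles more than two classes out of the box; the dataset below is synthetic, generated only to show the shape of the fitted model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic three-class problem with four features
X, y = make_classification(n_samples=300, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# Multi-class fits learn one coefficient row per class
print(clf.coef_.shape)     # (3, 4): 3 classes x 4 features
print(clf.predict(X[:5]))  # predicted class labels 0, 1, or 2
```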

6. Assumptions of Logistic Regression

1. The dependent variable is binary (or ordinal, for ordinal logistic regression).

2. There is a linear relationship between the independent variables and the log-odds of the outcome.

3. Observations are independent of each other.

4. There is little or no multicollinearity among the independent variables.

5. A reasonably large sample size is needed for stable, accurate estimates.

7. Applications in Real Life

Medical: Predicting if a patient has a disease.

Finance: Classifying if a transaction is fraudulent.

Marketing: Customer segmentation (likely to purchase or not).

Social Media: Spam detection.

Education: Predicting student dropout.


8. Logistic Regression vs Linear Regression

Feature        | Linear Regression        | Logistic Regression
Output Type    | Continuous               | Probability (0–1)
Used For       | Regression problems      | Classification problems
Function Used  | Linear function          | Sigmoid function
Output Range   | -∞ to +∞                 | 0 to 1
Cost Function  | Mean Squared Error (MSE) | Log Loss / Cross-Entropy

9. Evaluation Metrics

1. Accuracy:

\frac{TP + TN}{TP + TN + FP + FN}

2. Precision:

\frac{TP}{TP + FP}

3. Recall (Sensitivity):

\frac{TP}{TP + FN}

4. F1-Score:

The harmonic mean of precision and recall: \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}

5. Confusion Matrix:

A matrix showing the counts of True Positives, False Positives, True Negatives, and False Negatives.
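These metrics are available in scikit-learn; the sketch below applies them to a small hand-made example (six labels, chosen so the counts TP = 2, FN = 1, TN = 2, FP = 1 are easy to verify by hand).

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]   # TP=2, FN=1, TN=2, FP=1

print(confusion_matrix(y_true, y_pred))   # [[TN, FP], [FN, TP]]
print(accuracy_score(y_true, y_pred))     # (2+2)/6 ≈ 0.667
print(precision_score(y_true, y_pred))    # 2/(2+1) ≈ 0.667
print(recall_score(y_true, y_pred))       # 2/(2+1) ≈ 0.667
print(f1_score(y_true, y_pred))           # harmonic mean ≈ 0.667
```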

10. Regularization in Logistic Regression

Regularization helps prevent overfitting:

L1 Regularization (Lasso):

Can shrink some weights to exactly zero, performing implicit feature selection.

L2 Regularization (Ridge):

Shrinks all weights toward zero but keeps all features.

The cost function with L2 regularization becomes:

Loss = -\sum [y \log(p) + (1 - y) \log(1 - p)] + \lambda \sum w^2

where λ is the regularization strength.
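A quick sketch of both penalties in scikit-learn; note that scikit-learn's `C` parameter is the *inverse* of λ (smaller C means stronger regularization), and the dataset below is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 10 features, only a few informative; the rest are noise
X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           random_state=0)

# A strong L1 penalty (small C) can drive some weights to exactly zero
l1 = LogisticRegression(penalty='l1', solver='liblinear', C=0.1).fit(X, y)
# An L2 penalty shrinks weights but typically keeps all of them nonzero
l2 = LogisticRegression(penalty='l2', C=0.1).fit(X, y)

print(np.sum(l1.coef_ != 0), "nonzero weights with L1")
print(np.sum(l2.coef_ != 0), "nonzero weights with L2")
```

On data like this, L1 usually retains fewer nonzero weights than L2, which is why it is described as performing feature selection.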

11. Real-World Projects

Customer Churn Prediction:

Predict whether a user will stop using a telecom service.

Loan Approval:

Classify if a loan application should be approved.

Email Spam Classifier:

Identify if an email is spam or not.

Heart Disease Prediction:

Predict if a person has heart disease based on medical data.


12. Advantages & Limitations

Advantages:

Simple and easy to implement.

Efficient for binary classification.

Interpretable output.

Works well with linearly separable data.

Limitations:

Poor performance on non-linear data.

Assumes linear relationship with log-odds.

Sensitive to irrelevant features and outliers.

13. Conclusion

Logistic regression is a foundational classification technique in machine learning. It is widely used due to its simplicity, efficiency, and ability to handle binary outcomes effectively. With proper evaluation and preprocessing, logistic regression can produce strong results in real-world applications across industries.

14. References

1. Alpaydin, Ethem. Introduction to Machine Learning. MIT Press.

2. Scikit-learn official documentation: https://scikit-learn.org

3. Ng, Andrew. Machine Learning course, Coursera.

4. Analytics Vidhya.

5. GeeksforGeeks, Logistic Regression tutorial.

