Machine Learning Models
Topics to be covered
► Types of Learning
► Generalization error: bias, variance, underfitting and overfitting
► Metrics for evaluation viz. accuracy, scalability, squared error, precision
► Classification Accuracy and Performance
Types of Learning
Supervised Learning
Supervised Learning
Supervised Learning: Example 1
Supervised Learning: Example 1
Unsupervised Learning
Unsupervised Learning: Example
Unsupervised Learning
Unsupervised Learning: Examples
Unsupervised Learning: Examples
Reinforcement Learning
Reinforcement Learning: Example
Reinforcement Learning: Applications
► Gaming
► Robotics
Semi-supervised Learning
Semi-supervised Learning: Example
Semi-supervised Learning: Application
Generalization Error
Components of Generalization Error
► Bias → Underfitting
► Variance → Overfitting
Prediction error
Bias
Variance
► Variance is the variability of model prediction for a given data point or a value, which tells us the spread of our data.
► A model with high variance pays a lot of attention to the training data and does not generalize to data it has not seen before.
► As a result, such models perform very well on training data but have high error rates on test data.
► High variance arises when the model takes into account the fluctuations in the data, i.e. the noise as well.
► In simple terms, think of variance as the error rate on the testing data.
► When the error rate is high, we call it High Variance; when the error rate is low, we call it Low Variance.
Underfitting
► When a model is unable to capture the essence of the training data properly because it has too few parameters, this phenomenon is known as Underfitting.
► When the model has a high error rate on the training data, we can say the model is underfitting.
► This can also happen when we have too little data to build an accurate model.
► Since our model performs badly on the training data, it consequently performs badly on the testing data as well.
► A high error rate on training data implies High Bias; therefore, in simple terms, High Bias implies underfitting.
Overfitting
► When a model is built using so many predictors that it captures the noise along with the underlying pattern, it fits the training data too closely and leaves very little scope for generalizability. This phenomenon is known as Overfitting.
► When the model has a low error rate on training data but a high error rate on testing data, we can say the model is overfitting.
► This usually occurs when the model is too complex for the amount of training data, or when the hyperparameters have been tuned to produce a low error rate on the training data.
► A low error rate on training data implies Low Bias, whereas a high error rate on testing data implies High Variance; therefore, in simple terms, Low Bias and High Variance imply overfitting (illustrated in the sketch below).
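The underfitting/overfitting contrast can be reproduced in a few lines of code. The example below is illustrative only: it uses synthetic data (a noisy sine curve) and scikit-learn, neither of which comes from these slides, and fits polynomials of increasing degree so that a degree-1 model underfits (high bias) and a degree-15 model overfits (high variance).

```python
# Minimal sketch (synthetic data assumed) contrasting underfitting and overfitting
# with polynomial regression of increasing degree.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)   # noisy sine curve
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):            # too simple, reasonable, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))
    test_mse = mean_squared_error(y_te, model.predict(X_te))
    # degree 1: high train AND test error  -> high bias, underfitting
    # degree 15: low train but high test error -> high variance, overfitting
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```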
Bias-Variance, Underfitting and Overfitting
ML System Cycle
1. Ideation
2. Development
3. Production
4. Maintenance
1. Ideation
2. Development
► Once key metrics that correspond to the business objectives are agreed upon and historical
data is acquired, the data scientist can start developing the initial model.
► Data scientists have a wide array of tools available to solve their puzzles.
3. Production
► When the development phase is over, the developed model needs to be put in production
► The complexity of getting a model in production depends on the context of the problem,
the autonomy of data science teams and the overall maturity of the organization.
► Once a model is deployed, there are a number of measures that can be taken to improve the
robustness and quality of the machine learning model.
► These measures can be roughly divided into four areas. We call this post-production process maintenance.
1. Lineage: The lineage of a machine learning model refers to the origins of the model, including which source code the model uses, which data it was trained on, and what parameters were used.
4. Model Drift: Machine learning models can become obsolete when not maintained properly; this concept is called model drift.
Evaluation Metrics
Confusion Matrix
► Evaluation of the performance of a classification model is based on the counts of test records correctly and incorrectly predicted by the model.
► The rows represent the predicted values of the target variable and the columns represent the actual values.
Confusion Matrix
► True Positive (TP): the actual value was positive and the model predicted a positive value
► True Negative (TN): the actual value was negative and the model predicted a negative value
► False Positive (FP): the actual value was negative but the model predicted a positive value
► False Negative (FN): the actual value was positive but the model predicted a negative value (see the code sketch below)
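As an illustration only (the labels below are invented), the four counts can be read off a confusion matrix computed with scikit-learn. Note that scikit-learn's convention puts actual values in rows and predicted values in columns, which may be transposed relative to the slide's layout.

```python
# Hypothetical example of building a confusion matrix with scikit-learn.
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# For binary labels {0, 1}, ravel() flattens the 2x2 matrix into TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")
```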
Accuracy Example
Accuracy
► Accuracy is a useful metric only when you have a roughly equal distribution of classes in your classification problem.
► Accuracy is not a good metric to use when you have a class imbalance.
Imbalanced Data Example
► Imagine you are working on the sales data of a website. You know that 99% of website visitors are just lookers and only 1% are buyers. You are building a classification model to predict which website visitors are buyers and which are just lookers.
► Now imagine a model that doesn’t work very well. It predicts that 100% of your visitors are just lookers and that 0% of your visitors are buyers. It is clearly a very wrong and useless model.
► What would happen if we used the accuracy formula on this model? Your model has predicted only 1% wrongly: all the buyers have been misclassified as lookers. The percentage of correct predictions is therefore 99%. The problem here is that an accuracy of 99% sounds like a great result, whereas your model performs very poorly (demonstrated numerically below).
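A small numeric sketch of this example (the counts 990/10 are assumed for illustration) makes the point: the all-lookers model scores 99% accuracy while finding zero buyers.

```python
# Sketch of the website example with made-up numbers: 990 lookers, 10 buyers,
# and a useless model that predicts "looker" (0) for everyone.
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

y_true = np.array([0] * 990 + [1] * 10)   # 99% lookers, 1% buyers
y_pred = np.zeros_like(y_true)            # model predicts "looker" for everybody

print("accuracy:", accuracy_score(y_true, y_pred))        # 0.99, looks great
print("recall for buyers:", recall_score(y_true, y_pred)) # 0.0, the model finds no buyers
```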
Solving Imbalance In The Data
► You can resample your data set in such a way that the data is not imbalanced anymore. You can then use accuracy as a metric again.
► There are methods including undersampling, oversampling, and SMOTE data augmentation for resampling (see the sketch below).
► Another way to deal with class imbalance is to use better metrics than plain accuracy, like the F1 score, which takes into account not only the number of prediction errors that your model makes, but also the type of errors that are made.
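Below is a minimal sketch of naive random oversampling with scikit-learn on made-up data; SMOTE itself lives in the separate imbalanced-learn package and is not shown here.

```python
# Naive random oversampling of the minority class (synthetic, imbalanced data assumed).
import numpy as np
from sklearn.utils import resample

rng = np.random.RandomState(0)
X = rng.normal(size=(1000, 3))
y = (rng.rand(1000) < 0.01).astype(int)        # roughly 1% positive class (buyers)

X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]

# Oversample the minority class until both classes are the same size.
X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=len(y_maj), random_state=0)
X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])
print("class counts after resampling:", np.bincount(y_bal))
```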
Precision
► Within everything that has been predicted as positive, precision counts the percentage that is correct.
► Precision: out of all the predicted positive cases, how many did we predict correctly?
► A model that is not precise may find a lot of the positives, but its selection method is noisy: it also wrongly detects many cases that are not actually positive.
► A precise model is very “pure”: maybe it does not find all the positives, but the ones that the model does classify as positive are very likely to be correct.
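In terms of the confusion-matrix counts, precision is the fraction of predicted positives that are truly positive:

$$\text{Precision} = \frac{TP}{TP + FP}$$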
Recall
► Within everything that actually is positive, recall counts how many cases the model succeeded in finding.
► A model with high recall succeeds well in finding all the positive cases in the data, even though it may also wrongly identify some negative cases as positive cases.
► A model with low recall is not able to find all (or a large part) of the positive cases in the data.
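In the same notation, recall is the fraction of actual positives that the model manages to find:

$$\text{Recall} = \frac{TP}{TP + FN}$$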
Precision-Recall Trade-Off
► If you increase precision, it will reduce recall, and vice versa. This is called the precision/recall trade-off.
► The precision-recall trade-off represents the fact that in many cases, you can tweak a model to increase precision at the cost of a lower recall, or, on the other hand, increase recall at the cost of a lower precision (sketched below).
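A hedged sketch of this trade-off (synthetic data and scikit-learn, not from the slides): raising the decision threshold makes the classifier more conservative, which typically raises precision and lowers recall.

```python
# Precision/recall trade-off by moving the decision threshold of a classifier.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.8], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]          # probability of the positive class

for threshold in (0.3, 0.5, 0.7):
    y_pred = (proba >= threshold).astype(int)  # higher threshold -> fewer positives
    print(f"threshold={threshold}: "
          f"precision={precision_score(y_te, y_pred):.2f}  "
          f"recall={recall_score(y_te, y_pred):.2f}")
```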
Precision and Recall Example
F1 Score
► Precision and Recall are the two building blocks of the F1 score.
► The goal of the F1 score is to combine the precision and recall metrics into a single metric.
► Since the F1 score is the harmonic mean of Precision and Recall, it gives equal weight to Precision and Recall:
► A model will obtain a high F1 score if both Precision and Recall are high
► A model will obtain a low F1 score if both Precision and Recall are low
► A model will obtain a medium F1 score if one of Precision and Recall is low and the other is high
► An F1 score is considered perfect when it’s 1, while the model is a total failure when it’s 0.
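Concretely, the F1 score combines the two as their harmonic mean:

$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$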
F1-score
► A good F1 score means that you have low false positives and low false negatives, so you’re
correctly identifying real threats, and you are not disturbed by false alarms.
► An F1 score is considered perfect when it’s 1, while the model is a total failure when it’s 0.
Mean Squared Error (MSE)
► The Mean Squared Error (MSE) is perhaps the simplest and most common metric for evaluating regression models.
► To calculate the MSE, you take the difference between your model’s predictions and the ground
truth, square it, and average it out across the whole dataset.
► The MSE will never be negative, since we are always squaring the errors.
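In standard notation, with $y_i$ the ground truth, $\hat{y}_i$ the model’s prediction and $n$ the number of samples:

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$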
Mean Absolute Error (MAE)
► To calculate the MAE, you take the difference between your model’s
predictions and the ground truth, apply the absolute value to that difference,
and then average it out across the whole dataset.
► The MAE, like the MSE, will never be negative since in this case we are always
taking the absolute value of the errors.
► The MAE is formally defined by the following equation:
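$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

(with $y_i$ the ground truth, $\hat{y}_i$ the prediction and $n$ the number of samples, as for the MSE)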
Root Mean Squared Error (RMSE)
► RMSE is the standard deviation of the errors which occur when a prediction is
made on a dataset.
► This is the same as MSE (Mean Squared Error), except that the square root of the value is taken when determining the accuracy of the model.
► The RMSE is formally defined by the following equation:
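$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

All three regression metrics can be computed in a couple of lines; the snippet below is a sketch on made-up numbers using scikit-learn and NumPy (not part of the slides).

```python
# Computing MSE, MAE and RMSE on hypothetical predictions.
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mse)                       # RMSE is just the square root of the MSE
print(f"MSE={mse:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```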
ML Scalability
► ML scalability is scaling ML models to handle massive data sets and perform many computations in a cost-effective and time-saving way.
ML scalability with an example:
► A model built to predict stock prices consumes data from a large dataset and delivers predictions instantly. These predictions are relevant only for a limited timeframe, and delayed predictions become meaningless from the user’s perspective. Stock prices are highly dynamic by nature, so getting an instant stock prediction is very important here. Scalability comes to the rescue in such situations: it allows ML models to serve millions of users and to work well with big data.
Check your understanding
Check your understanding
Check your understanding
► Which of the following would help to increase the value of precision? (Multiple
answers possible)
A. Increasing true positive
B. Increasing true negative
C. Decreasing false positive
D. Decreasing false negative
Check your understanding
References
► https://towardsdatascience.com/the-f1-score-bec2bbc38aa6
► https://medium.com/analytics-vidhya/precision-recall-tradeoff-79e892d43134