MACHINE LEARNING – PART 20
EVALUATION METRICS – CLASSIFICATION METRICS
EVALUATION METRICS
These performance metrics help us understand how well our model performs on the given data.
CLASSIFICATION METRICS
Accuracy
Recall
Precision
F1 Score
Confusion Matrix
ACCURACY
Imagine you've built a spam email filter, a program that predicts whether an
incoming email is spam or not. You feed it 100 emails to test its performance.
Your filter correctly identifies 80 emails as not spam.
It correctly identifies 15 emails as spam.
However, it makes mistakes and classifies 5 non-spam emails as spam.
To calculate accuracy:
Accuracy = (Number of Correct Predictions) / (Total Number of Predictions)
Accuracy = (80 + 15) / 100 = 95%
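As a quick sanity check, here is a minimal Python sketch that reproduces this calculation; the label arrays are hypothetical stand-ins for the 100 test emails (1 = spam, 0 = not spam).

from sklearn.metrics import accuracy_score

# Hypothetical labels matching the counts above:
# 80 non-spam correctly kept, 15 spam correctly caught, 5 non-spam wrongly flagged
y_true = [0] * 80 + [1] * 15 + [0] * 5
y_pred = [0] * 80 + [1] * 15 + [1] * 5

accuracy = accuracy_score(y_true, y_pred)   # (80 + 15) / 100
print(f"Accuracy: {accuracy:.2%}")          # Accuracy: 95.00%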
IMPORTANT TERMS
True Positive (TP) is an outcome where the model correctly predicts
the positive class.
True Negative (TN) is an outcome where the model correctly
predicts the negative class.
False Positive (FP) is an outcome where the model incorrectly
predicts the positive class.
False Negative (FN) is an outcome where the model incorrectly
predicts the negative class.
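As a small illustration, the sketch below tallies these four outcomes directly from a pair of label lists; the helper name confusion_counts and the example labels are hypothetical, reusing the spam-filter counts from the accuracy example.

def confusion_counts(y_true, y_pred):
    # 1 = positive class, 0 = negative class
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

# Spam-filter counts from the accuracy example: 15 TP, 80 TN, 5 FP, 0 FN
y_true = [0] * 80 + [1] * 15 + [0] * 5
y_pred = [0] * 80 + [1] * 15 + [1] * 5
print(confusion_counts(y_true, y_pred))  # (15, 80, 5, 0)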
PRECISION AND RECALL
Precision is a measure of how many of the positive
predictions made by a classification model were
actually correct.
Recall is a measure of how many of the actual
positive instances in the dataset were correctly
predicted by the model.
Suppose you have a medical test that detects a rare disease. Out of 100 patients who
took the test, only 10 have the disease, and 90 do not. The test results are as follows:
•True Positives (correctly predicted as having the disease): 7
•False Positives (incorrectly predicted as having the disease): 3
•False Negatives (incorrectly predicted as not having the disease): 3
•True Negatives (correctly predicted as not having the disease): 87
Now, let's calculate precision and recall:
•Precision = TP / (TP + FP) = 7 / (7 + 3) = 0.7 (70%)
This means that of all the patients predicted as having the disease, 70% truly have it.
•Recall = TP / (TP + FN) = 7 / (7 + 3) = 0.7 (70%)
This means that of all the patients who actually have the disease, the test correctly
identified 70% of them.
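A minimal sketch of the same calculation with scikit-learn, assuming hypothetical label arrays built to reproduce the 7/3/3/87 counts above (1 = has the disease, 0 = healthy):

from sklearn.metrics import precision_score, recall_score

# 7 TP, 3 FN, 3 FP, 87 TN
y_true = [1] * 7 + [1] * 3 + [0] * 3 + [0] * 87
y_pred = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 87

print(precision_score(y_true, y_pred))  # 0.7 = 7 / (7 + 3)
print(recall_score(y_true, y_pred))     # 0.7 = 7 / (7 + 3)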
F1 SCORE
The F1 score is a way to balance precision and recall: it is the harmonic mean of the two.
Suppose you are hunting for treasure with a device that signals where to dig (the same scenario as the confusion matrix on the next slide):
•True Positives (TP) = 90 (successful digs).
•False Positives (FP) = 10 (unnecessary digs in the wrong places).
•False Negatives (FN) = 10 (missed opportunities to dig and find treasure).
Let's calculate precision and recall first:
•Precision (P) = 90 / (90 + 10) = 90/100 = 0.9 (90% precision).
•Recall (R) = 90 / (90 + 10) = 90/100 = 0.9 (90% recall).
Now, let's calculate the F1 score using the formula:
•F1 Score = 2 × (Precision × Recall) / (Precision + Recall) = 2 × (0.9 × 0.9) / (0.9 + 0.9) = 2 × (0.81 / 1.8) = 0.9 (90% F1 score)
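The same result can be checked with scikit-learn's f1_score; the label arrays below are hypothetical and simply reproduce the 90/10/10 counts (plus the 10 true negatives from the confusion matrix on the next slide):

from sklearn.metrics import f1_score

# 90 TP, 10 FP, 10 FN, 10 TN; 1 = treasure, 0 = no treasure
y_true = [1] * 90 + [0] * 10 + [1] * 10 + [0] * 10
y_pred = [1] * 90 + [1] * 10 + [0] * 10 + [0] * 10

print(f1_score(y_true, y_pred))  # 0.9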
CONFUSION MATRIX
                          Actual Treasure (Positive)   Actual No Treasure (Negative)
Predicted Treasure        90 (TP)                      10 (FP)
Predicted No Treasure     10 (FN)                      10 (TN)

•True Positives (TP) = 90 (successful digs where your device correctly signaled you to dig, and you found treasure).
•True Negatives (TN) = 10 (cases where your device correctly signaled you not to dig, and there was no treasure).
•False Positives (FP) = 10 (instances where your device incorrectly signaled you to dig, but there was no treasure).
•False Negatives (FN) = 10 (cases where your device incorrectly signaled you not to dig, but there was treasure).
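As a closing sketch, scikit-learn's confusion_matrix builds this table from the same hypothetical treasure-hunt labels used in the F1 example:

from sklearn.metrics import confusion_matrix

# 1 = treasure, 0 = no treasure
y_true = [1] * 90 + [0] * 10 + [1] * 10 + [0] * 10
y_pred = [1] * 90 + [1] * 10 + [0] * 10 + [0] * 10

# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]]
print(confusion_matrix(y_true, y_pred))
# [[10 10]
#  [10 90]]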