MENTORNESS ARTICLE
TASK 1
MIP-ML-08 BATCH
Evaluation Metrics in Machine Learning:
Exploring Performance Assessment
BY NIGAR SULTANA
What Are Evaluation Metrics in Machine Learning?
Evaluation metrics in Machine Learning (ML) are essential tools for measuring how well ML models work. They provide numerical measures of how effectively a model handles new data and help determine which model is best suited to a task. Metrics such as accuracy, precision, and recall indicate whether the model is performing well. The goal is to make sure the model can learn from its training data and make accurate predictions on data it has never seen before. These metrics also help select the best model from many candidates and can pinpoint where a model needs to improve. This allows ML practitioners to make adjustments that keep the model effective and reliable in real-life situations.
Why Do We Need Evaluation Metrics in ML?
Evaluation metrics are crucial in ML for several reasons:
• They help us understand if ML models are effective and accurate in their
predictions or classifications.
• These metrics guide us in choosing the best model among several options by
comparing their performances.
• They play a role in tuning model settings (hyperparameters) to make them
perform better.
• Evaluation metrics provide a foundation for constantly improving and fine-tuning
ML algorithms.
• They allow us to objectively assess and compare different models based on their
performance scores.
• These metrics are essential for making informed decisions about which ML
model to use in a given situation.
• They help us identify areas where ML models need improvement or adjustments.
• Using evaluation metrics ensures that ML models meet desired performance
standards and deliver reliable results.
• Overall, evaluation metrics are key tools for assessing, improving, and optimizing
ML models for various applications.
Types of Evaluation Metrics
There are various types of evaluation metrics used in ML, including:
1. Regression Model Evaluation Metrics: These metrics assess how well
regression models predict numerical outcomes. They include:
• Mean Absolute Error (MAE): This measures the average absolute difference between predicted and actual values, giving an idea of how accurate the predictions are.
• Root Mean Squared Error (RMSE): Similar to MAE but emphasizes larger
errors, providing a more comprehensive view of prediction accuracy.
• R-squared (R2) Score: It shows how much of the variance in the data is
explained by the model, indicating how well the model fits the data.
2. Classification Model Evaluation Metrics: These metrics evaluate how
accurately classification models classify data into different categories. They
include:
• Accuracy: This measures the overall correctness of the model's
predictions.
• Precision: It shows how many positive predictions were actually correct,
focusing on the accuracy of positive predictions.
• Recall (Sensitivity): This metric indicates how many actual positive
instances the model correctly identified, emphasizing the model's ability to
capture all positive cases.
• F1 Score: The F1 score combines precision and recall into a single value,
offering a balanced assessment of the model's performance in handling
class imbalances.
• Confusion Matrix: A confusion matrix is a tabular representation of a
machine learning model's performance, displaying the counts of true
positive, true negative, false positive, and false negative predictions.
Explaining Each Evaluation Metric in Detail
1. Mean Absolute Error (MAE):
• MAE measures the average magnitude of errors between predicted and
actual values in regression models.
• It provides insights into how accurate the predictions of the model are on
average.
• MAE is calculated by taking the average of the absolute differences
between predicted and actual values.
Formula for MAE:
MAE = (1/N) × Σ |y_j − y_hat_j|
Where:
• y_j: ground-truth value
• y_hat_j: predicted value from the regression model
• N: number of data points
Example Graph: Mean Absolute Error
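To make the MAE formula concrete, here is a minimal Python sketch (the arrays are made-up illustration values) that computes it both by hand with NumPy and with scikit-learn's mean_absolute_error:

import numpy as np
from sklearn.metrics import mean_absolute_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # hypothetical ground-truth values
y_pred = np.array([2.5, 0.0, 2.0, 8.0])    # hypothetical model predictions

mae_manual = np.mean(np.abs(y_true - y_pred))       # average of |y_j - y_hat_j|
mae_sklearn = mean_absolute_error(y_true, y_pred)   # same value via scikit-learn

print(mae_manual, mae_sklearn)  # both print 0.5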
2. Root Mean Squared Error (RMSE):
• RMSE is similar to MAE but gives more weight to larger errors, making it
sensitive to outliers.
• It provides a more comprehensive assessment of model performance by
penalizing significant errors.
• RMSE is calculated by taking the square root of the average of squared
differences between predicted and actual values.
Formula for RMSE:
RMSE = √( (1/N) × Σ (y_j − y_hat_j)² )
Example Graph: Root Mean Squared Error
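The same toy values can be used to verify RMSE by hand; this sketch assumes scikit-learn is available and takes the square root of its mean_squared_error result:

import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # hypothetical ground-truth values
y_pred = np.array([2.5, 0.0, 2.0, 8.0])    # hypothetical model predictions

rmse_manual = np.sqrt(np.mean((y_true - y_pred) ** 2))      # root of the mean squared error
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))  # same value via scikit-learn

print(rmse_manual, rmse_sklearn)  # both print about 0.612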
3. R-squared (R2) Score:
• R2 score quantifies the proportion of variance in the dependent variable
explained by independent variables in regression models.
• It indicates the goodness of fit of the regression model, showing how well
the model fits the data.
• R2 score typically ranges from 0 to 1, where 1 indicates a perfect fit and 0 indicates that the model explains none of the variance; it can even be negative when the model performs worse than simply predicting the mean.
Formula for R2 Score:
R2 = 1 − Σ (y_j − y_hat_j)² / Σ (y_j − ȳ)²
Where:
• ȳ is the mean of the actual values.
Example Graph: R-Squared Score
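A minimal sketch of the R2 calculation in Python, again with made-up values, comparing the manual formula against scikit-learn's r2_score:

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # hypothetical ground-truth values
y_pred = np.array([2.5, 0.0, 2.0, 8.0])    # hypothetical model predictions

ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares around the mean
r2_manual = 1 - ss_res / ss_tot
r2_sklearn = r2_score(y_true, y_pred)

print(r2_manual, r2_sklearn)  # both print about 0.949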
4. Accuracy:
• Accuracy is a simple and intuitive metric that measures the percentage of correct
predictions made by a model.
• It is suitable for balanced datasets where the positive and negative classes are
similar in number.
• However, in imbalanced datasets, accuracy can be misleading as it favors the
majority class predictions, neglecting the minority class.
• This can lead to an inaccurate assessment of the model's performance,
especially in scenarios where the minority class is of high importance.
Formula for Accuracy:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Where:
• TP = True Positive
• TN = True Negative
• FP = False Positive
• FN = False Negative
Example Graph: Accuracy
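The sketch below uses small made-up binary labels to show accuracy computed from the four confusion-matrix counts and, equivalently, with scikit-learn's accuracy_score:

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])  # hypothetical actual labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])  # hypothetical predicted labels

tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives  = 3
tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives  = 4
fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives = 1
fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives = 2

acc_manual = (tp + tn) / (tp + tn + fp + fn)   # (3 + 4) / 10 = 0.7
acc_sklearn = accuracy_score(y_true, y_pred)   # same value via scikit-learn

print(acc_manual, acc_sklearn)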
5. Precision:
• Precision measures the proportion of correctly predicted positive instances out
of all instances predicted as positive.
• It is particularly valuable when the cost of false positives is significant.
• For instance, in medical diagnosis, high precision indicates accurate
identification of patients with a disease, reducing false positive cases.
Formula for Precision:
Precision = TP / (TP + FP)
Example Graph: Precision
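Reusing the same made-up labels as in the accuracy example, precision can be checked by hand and with scikit-learn's precision_score:

import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])  # hypothetical actual labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])  # hypothetical predicted labels

tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives  = 3
fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives = 1

precision_manual = tp / (tp + fp)                    # 3 / 4 = 0.75
precision_sklearn = precision_score(y_true, y_pred)  # same value via scikit-learn

print(precision_manual, precision_sklearn)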
6. Recall (Sensitivity):
• Recall (also known as Sensitivity) measures the proportion of correctly predicted
positive instances out of all actual positive instances.
• It is crucial in scenarios where capturing all positive cases is vital, even if it
results in some false alarms.
• For instance, in healthcare, high recall ensures that the model doesn't miss
identifying patients with a disease, even if it means some healthy individuals are
flagged for further evaluation.
Formula for Recall:
Recall = TP / (TP + FN)
Where:
• TP = True Positive
• FN = False Negative
Example Graph: Recall
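With the same made-up labels, recall is the fraction of actual positives the model caught; scikit-learn's recall_score gives the same number as the manual calculation:

import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])  # hypothetical actual labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])  # hypothetical predicted labels

tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives  = 3
fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives = 2

recall_manual = tp / (tp + fn)                 # 3 / 5 = 0.6
recall_sklearn = recall_score(y_true, y_pred)  # same value via scikit-learn

print(recall_manual, recall_sklearn)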
7. F1 Score:
• The F1 score represents the harmonic mean of precision and recall.
• It offers a balanced assessment of a model's performance by taking into account
both precision and recall.
• The F1 score favors models that maintain a similar balance between precision and recall.
• The harmonic mean is particularly suitable for averaging ratios of values, making
the F1 score valuable in scenarios with imbalanced precision and recall values.
Example Graph: F1 Score
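Concretely, F1 = 2 × (Precision × Recall) / (Precision + Recall). A minimal sketch, reusing the same made-up labels as above and scikit-learn's f1_score:

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]  # hypothetical predicted labels

precision = precision_score(y_true, y_pred)  # 0.75
recall = recall_score(y_true, y_pred)        # 0.6

f1_manual = 2 * precision * recall / (precision + recall)  # harmonic mean, about 0.667
f1_sklearn = f1_score(y_true, y_pred)                      # same value via scikit-learn

print(f1_manual, f1_sklearn)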
8. Confusion Matrix:
• The confusion matrix is a tabular representation of true and predicted classes in
a classification problem.
• It displays the four possible combinations of true positives, true negatives, false
positives, and false negatives, offering insights into the model's performance and
areas for improvement.
Structure of the Confusion Matrix:
                     Predicted Positive       Predicted Negative
Actual Positive      True Positive (TP)       False Negative (FN)
Actual Negative      False Positive (FP)      True Negative (TN)
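scikit-learn's confusion_matrix returns these counts as a 2×2 array; with binary labels ordered [0, 1], the first row holds the negatives (TN, FP) and the second row the positives (FN, TP). A minimal sketch with the same made-up labels as in the earlier examples:

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]  # hypothetical predicted labels

cm = confusion_matrix(y_true, y_pred)
# Layout with labels [0, 1]:
# [[TN, FP],
#  [FN, TP]]
print(cm)  # [[4 1]
           #  [2 3]]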
Recap:
In conclusion, evaluation metrics are indispensable tools in ML for assessing model
performance and guiding decision-making processes. Incorporating both regression
and classification model evaluation metrics provides a comprehensive understanding
of a model's capabilities and areas for improvement. The summary below lists the key evaluation metrics discussed in this article, along with their descriptions and formulas:
• Mean Absolute Error (MAE): Measures the average magnitude of errors between predicted and actual values in regression models. Formula: MAE = (1/N) × Σ |y_j − y_hat_j|
• Root Mean Squared Error (RMSE): Similar to MAE but penalizes large errors more heavily; sensitive to outliers. Formula: RMSE = √( (1/N) × Σ (y_j − y_hat_j)² )
• R-squared (R2) Score: Quantifies the proportion of variance in the dependent variable explained by the independent variables. Formula: R2 = 1 − Σ (y_j − y_hat_j)² / Σ (y_j − ȳ)²
• Accuracy: Measures the proportion of correct predictions made by the model over all predictions. Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN)
• Precision: Measures the proportion of correctly predicted positive instances out of all predicted positives. Formula: Precision = TP / (TP + FP)
• Recall (Sensitivity): Measures the proportion of correctly predicted positive instances out of all actual positives. Formula: Recall = TP / (TP + FN)
• F1 Score: Combines precision and recall into a single value, offering a balanced assessment of model performance. Formula: F1 = 2 × (Precision × Recall) / (Precision + Recall)
• Confusion Matrix: Tabular representation of true and predicted classes, displaying counts of true positives, true negatives, false positives, and false negatives.
By leveraging evaluation metrics effectively, data scientists and ML practitioners can
develop robust and accurate ML models that meet the desired performance standards
and deliver reliable results in real-world applications.
Thank you.