CS446: Machine Learning
Lecture 10-11 (Concepts Relevant to ML Models)
Instructor:
Dr. Muhammad Kabir
Assistant Professor
[email protected]
School of Systems and Technology
Department of Computer Science
University of Management and Technology, Lahore
Previous Lectures…
Label Encoding
Understanding and Implementing Train Test Split
Machine Learning Process
How does Machine Learning work?
What is Learning?
Understanding Curve fitting in Machine Learning
Why do we need complex models?
Today’s Lectures…
Selecting the right ML model
Cross-Validation – Statistical measures
Overfitting in ML – concept, signs, causes and prevention
Underfitting in ML – concept, signs, causes and prevention
Bias-variance tradeoff in ML
Loss function
Model evaluation – accuracy score, mean squared error
Model parameters and hyperparameter
Gradient descent in ML
Machine Learning Process….
Model Selection…
The process of choosing the model best suited to a particular problem
is called model selection.
It depends on various factors such as the dataset, the task, and the
nature of the model.
Model Selection…
Model selection depends on
1. Type of Data
a. Images & videos – CNN
b. Text or Speech – RNN
c. Numerical data – SVM, Logistic Regression,
Decision Trees etc.
2. Tasks we need to carry out
a. Classification tasks – SVM, LR, DT
b. Regression tasks – Linear Regression,
Random Forest, Polynomial Regression
c. Clustering tasks – K-means clustering,
Hierarchical clustering
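As an illustration, the task-to-model mapping above can be sketched in code. The specific model choices and the `candidate_models` helper are assumptions for illustration, not a prescription:

```python
# Sketch: mapping task type to candidate scikit-learn models.
# The CANDIDATES table and candidate_models() are illustrative assumptions.
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

CANDIDATES = {
    "classification": [SVC, LogisticRegression, DecisionTreeClassifier],
    "regression": [LinearRegression, RandomForestRegressor],
    "clustering": [KMeans, AgglomerativeClustering],
}

def candidate_models(task):
    """Instantiate the candidate models for the given task type."""
    return [cls() for cls in CANDIDATES[task]]

print([type(m).__name__ for m in candidate_models("classification")])
# ['SVC', 'LogisticRegression', 'DecisionTreeClassifier']
```

In practice each candidate would then be trained and compared with one of the validation methods described next.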
Model Validation methods…
In ML approaches, the error rate is commonly used to appraise the
performance of classifiers; for this purpose, the benchmark dataset
is divided into different portions.
Usually, two statistical methods called holdout and cross-validation are
used for the partitioning of the benchmark dataset.
1. Holdout
a. Some data is kept for testing, and the remaining is reserved for training.
b. Generally, one third is used for testing and two thirds for training.
2. Cross-validation
a. The dataset is partitioned into a fixed number of mutually exclusive folds.
b. Three renowned statistical cross-validation tests are used:
c. the subsampling (k-fold) test, the jackknife test, and the independent test.
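The holdout method above can be sketched with scikit-learn's `train_test_split`; the toy data is an assumption, and the one-third test fraction follows the slide:

```python
# Holdout split: one third for testing, two thirds for training.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(30).reshape(15, 2)   # 15 toy samples, 2 features
y = np.arange(15) % 2              # toy binary labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=42)

print(len(X_train), len(X_test))   # 10 training samples, 5 test samples
```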
Cross-validation methods…
1. Jackknife
a. Leave-one-out cross-validation (LOOCV).
b. The dataset is fragmented into n folds, one per sample.
c. One sample is used for testing and the remaining n-1 for training.
d. The maximum amount of data is utilized for training.
e. Time-consuming for large datasets.
f. Deterministic – the result does not change across runs.
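A minimal sketch of the jackknife (LOOCV) split, assuming scikit-learn's `LeaveOneOut` and toy data:

```python
# Leave-one-out (jackknife): n folds for n samples; each sample is the
# test set exactly once, and the other n-1 samples form the training set.
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(10).reshape(5, 2)    # 5 toy samples -> 5 folds
loo = LeaveOneOut()

n_splits = 0
for train_idx, test_idx in loo.split(X):
    assert len(test_idx) == 1               # one sample held out for testing
    assert len(train_idx) == len(X) - 1     # n-1 samples for training
    n_splits += 1

print(n_splits)  # 5 folds for 5 samples
```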
Cross-validation methods…
2. K-fold
a. The dataset is divided into K subsets without repetition.
b. Each iteration uses one subset as the test set and the
remaining K-1 subsets as the training set; this is repeated
K times, learning K classification models.
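The K-fold procedure above can be sketched as follows; the toy data, K = 5, and the logistic-regression model are assumptions for illustration:

```python
# K-fold: K mutually exclusive subsets; each subset is the test set once,
# the remaining K-1 subsets train the model, giving K fitted models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X = np.random.RandomState(0).rand(20, 3)   # toy features
y = np.arange(20) % 2                      # toy binary labels

scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print(len(scores))  # 5 accuracy scores, one per fold
```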
Cross-validation methods…
3. Independent test
a. The independent dataset test is one of the essential tests,
employed to assess the generalization performance of model in
the field of ML.
b. The model is trained on a benchmark dataset, while an
independent dataset is utilized for its testing.
c. The data consumed in the training phase are not used in the
testing phase, which means the data in the benchmark dataset
and the independent dataset are dissimilar.
d. The main reasons to use an independent dataset test are to
evaluate whether a trained model overfits the training dataset,
and to adequately assess the generalization capabilities of a
trained model.
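A minimal sketch of an independent dataset test; both datasets are synthetic stand-ins, purely for illustration:

```python
# Independent test sketch: the model never sees the independent set
# during training; benchmark and independent data are disjoint.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(1)
X_benchmark, y_benchmark = rng.rand(50, 4), rng.randint(0, 2, 50)
X_independent, y_independent = rng.rand(20, 4), rng.randint(0, 2, 20)

model = LogisticRegression().fit(X_benchmark, y_benchmark)  # benchmark only
generalization_acc = model.score(X_independent, y_independent)
print(0.0 <= generalization_acc <= 1.0)  # True
```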
Cross-validation methods…
Cross-validation implementation
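One possible implementation, assuming scikit-learn's `cross_val_score` with the bundled iris dataset as a stand-in for a benchmark dataset:

```python
# 5-fold cross-validation in one call: the model is trained and scored
# K times, once per fold, and the per-fold accuracies are returned.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
print(scores.shape)     # (5,) -> one accuracy per fold
print(scores.mean())    # averaged cross-validation accuracy
```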
Overfitting in ML Models….
Overfitting refers to model behavior that fits the training
data too well.
Overfitting happens when a model learns the details and noise
in the training dataset to the extent that it negatively
impacts the performance of the model on new data.
A sign that the model has overfitted:
high training data accuracy and low
testing data accuracy.
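The overfitting signature can be demonstrated with an unconstrained decision tree on noisy synthetic data; this is a sketch under assumed data, not an example from the slides:

```python
# An unlimited-depth tree memorizes noisy training data (train accuracy
# 1.0) but scores noticeably lower on unseen test data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = (X[:, 0] + 0.5 * rng.randn(200) > 0.5).astype(int)  # noisy labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)  # no depth limit

print(tree.score(X_tr, y_tr))  # 1.0 -> memorizes the training set
print(tree.score(X_te, y_te))  # lower on unseen data
```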
Example - Overfitting in ML Models….
Overfitting in ML Models….
Causes of overfitting
Too little training data
Excessive model complexity
Too many layers in a Neural Network (NN)
Preventing overfitting
Using more data
Reducing the number of layers in the NN
Early stopping
Bias-variance tradeoff
Using Dropouts (usually in NN and Deep Learning)
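Early stopping, one of the prevention techniques listed above, can be sketched framework-agnostically; the `early_stop_index` helper and the hard-coded loss sequence are illustrative assumptions:

```python
# Early stopping sketch: stop training when the validation loss has not
# improved for `patience` consecutive epochs.
def early_stop_index(val_losses, patience=2):
    """Return the epoch index at which training would stop."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch        # new best validation loss
        elif epoch - best_epoch >= patience:
            return epoch                          # patience exhausted: stop
    return len(val_losses) - 1                    # never triggered

# Validation loss falls, then rises as the model starts to overfit:
losses = [0.9, 0.6, 0.5, 0.55, 0.6, 0.7]
print(early_stop_index(losses))  # 4
```

The stop fires at epoch 4, two epochs after the minimum at epoch 2, so the weights from epoch 2 would be kept.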
Underfitting in ML Models….
Underfitting happens when the model does
not learn enough from the data.
It occurs when an ML model cannot capture
the underlying trend of the data.
A sign that the model has underfitted: low
training data accuracy and low testing data accuracy.
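The underfitting signature can be demonstrated with a model that is too simple for the data; this sketch fits a depth-1 decision stump to an assumed XOR pattern it cannot represent:

```python
# A depth-1 stump on XOR-patterned data: accuracy is low on BOTH the
# training and the test set, the hallmark of underfitting.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.rand(300, 2)
y = ((X[:, 0] > 0.5) ^ (X[:, 1] > 0.5)).astype(int)  # XOR pattern

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)

print(stump.score(X_tr, y_tr))  # low: one split cannot represent XOR
print(stump.score(X_te, y_te))  # similarly low on the test set
```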
Example - Underfitting in ML Models….
Underfitting in ML Models….
Causes of underfitting
Choosing the wrong model
Insufficient model complexity
Low variance but high bias
Preventing underfitting
Selecting the correct model appropriate for the problem
Increasing the complexity of the model
Adding more parameters to the model
Optimal Bias-variance tradeoff
Bias-variance Tradeoff in ML Models….
Bias
Bias is the difference between the values predicted by the
ML model and the correct values.
High bias gives a large error on training as well as testing
data.
It is recommended that an algorithm be low-bias to avoid the
problem of underfitting.
Bias-variance Tradeoff in ML Models….
Variance
The variability of model predictions for a
given data point, which tells us the spread
of the predictions, is called the variance of
the model.
A model with high variance fits the training
data very closely and thus is not able to
predict accurately on data it has not seen
before.
Being high on variance leads to overfitting of the data.
(Figure: example of a high-variance fit to training data)
Bias-variance Tradeoff in ML Models….
Tradeoff
A too-simple algorithm may have high bias
and low variance.
A too-complex algorithm may have high
variance and low bias.
This tradeoff in complexity is why there is
a tradeoff between bias and variance: an
algorithm cannot be more complex and
less complex at the same time.
(Figure: the optimal tradeoff point between bias and variance)
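The tradeoff can be made concrete by sweeping model complexity; this sketch fits polynomials of increasing degree to noisy sine data (all data synthetic, chosen for illustration):

```python
# Bias-variance sweep: training error always shrinks as the polynomial
# degree grows, while the degree-15 fit chases noise (high variance).
import numpy as np

rng = np.random.RandomState(0)
x_train = np.sort(rng.rand(30))
y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.randn(30)
x_test = np.sort(rng.rand(30))
y_test = np.sin(2 * np.pi * x_test) + 0.3 * rng.randn(30)

def fit_mse(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coefs = np.polyfit(x_train, y_train, degree)
    train = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train, test

train1, test1 = fit_mse(1)     # high bias: both errors large
train3, test3 = fit_mse(3)     # balanced: both errors small
train15, test15 = fit_mse(15)  # high variance: train error keeps shrinking

print(train15 <= train3 <= train1)  # True
print(test3 < test1)                # True: degree 3 generalizes better
```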
Loss function in ML Models….
A loss function measures how far an estimated value is from
the true (actual) value – the difference between predicted and
actual values.
It helps to determine which model performs better and
which parameters are better.
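A minimal sketch of two common loss functions, computed by hand on illustrative values:

```python
# Loss as the gap between predicted and actual values.
def mse_loss(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae_loss(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 2.0]
y_pred = [2.5, 5.0, 4.0]
print(mse_loss(y_true, y_pred))  # (0.25 + 0 + 4) / 3 = 1.4166...
print(mae_loss(y_true, y_pred))  # (0.5 + 0 + 2) / 3 = 0.8333...
```

MSE penalizes the large miss on the third sample much more heavily than MAE does, which is why the choice of loss function changes which model looks "better".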
Example - Loss function in ML Models….
ML Models evaluation – Accuracy & Error….
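The two evaluation measures named in the lecture outline, accuracy score for classification and mean squared error for regression, can be computed with scikit-learn (illustrative values assumed):

```python
from sklearn.metrics import accuracy_score, mean_squared_error

# Classification: fraction of correct predictions.
y_true_cls = [0, 1, 1, 0, 1]
y_pred_cls = [0, 1, 0, 0, 1]
print(accuracy_score(y_true_cls, y_pred_cls))      # 0.8 (4 of 5 correct)

# Regression: mean squared error between predicted and actual values.
y_true_reg = [3.0, 5.0, 2.0]
y_pred_reg = [2.5, 5.0, 4.0]
print(mean_squared_error(y_true_reg, y_pred_reg))  # 1.4166...
```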
Chapter Reading
Chapter 01
Pattern Recognition and Machine Learning by Christopher M. Bishop
Machine Learning by Tom Mitchell