Machine Learning
Samatrix Consulting Pvt Ltd
Resampling Methods
• Resampling methods involve repeatedly drawing samples from a training set and refitting the model of interest on each sample. In this process, we obtain additional information about the fitted model.
• This information would not be available if we fit the model only once using the original training sample.
• Because resampling methods involve fitting a machine learning method multiple times, using different subsets of the training data, they can be computationally expensive.
• However, due to recent advances in computing power, the computational requirements of resampling methods are generally no longer prohibitive.

Resampling Methods
• The most commonly used resampling methods are cross-validation and the bootstrap.
• For example, we can use cross-validation to estimate the test error associated with a machine learning method, either to evaluate its performance or to select its level of flexibility.
• The process of evaluating the performance of a model is called model assessment, whereas the process of selecting the level of flexibility of a model is called model selection.
• We use the bootstrap to measure the accuracy of a given machine learning method.

Resampling Methods
• In the previous chapters, we discussed how the test error rate and the training error rate can differ from each other.
• If we do not have a very large test data set with which to estimate the test error rate, we can use various techniques to estimate it from the available training data.
• One class of methods estimates the test error rate by holding out a subset of the training observations from the fitting process and then applying the machine learning method to the held-out observations.

Validation Set Approach
• The validation set approach is used to estimate the test error rate.
• In this approach, we randomly divide the available set of observations into two parts: a training set and a validation set (or hold-out set).
• We use the training set to fit the model and then use the fitted model to predict the responses for the observations in the validation set.
• The resulting validation set error rate provides an estimate of the test error rate.

Validation Set Approach
• Figure 1 shows a schematic display of the validation set approach.
• A set of $n$ observations is randomly split into a training set and a validation set.
• The training set is shown in blue and contains observations 7, 22, and 13, among others.
• The validation set is shown in beige and contains observation 91, among others.
• We fit a machine learning model on the training set and use the validation set to evaluate its performance.

Validation Set Approach
• Even though the validation set approach is conceptually simple and easy to implement, it suffers from two drawbacks (see the sketch below):
• The validation estimate of the test error rate can be highly variable, because it depends on exactly how the observations are split between the training set and the validation set.
• Only a subset of the observations is used to fit the model. Machine learning models tend to perform worse when trained on fewer observations, so the validation set error rate may overestimate the test error rate for a model fit on the entire data set.
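A minimal sketch of the validation set approach, assuming Python with scikit-learn and a small simulated data set (the library choice and the data are illustrative, not part of the original example):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Simulated data: 100 observations from a noisy linear relationship
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X.ravel() + rng.normal(0, 1, size=100)

# Randomly divide the observations into a training set and a validation set
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.5, random_state=1
)

# Fit on the training set, then predict the held-out responses
model = LinearRegression().fit(X_train, y_train)
val_mse = mean_squared_error(y_val, model.predict(X_val))
print(f"Validation set estimate of the test MSE: {val_mse:.3f}")
```

Re-running this with a different random_state generally produces a different estimate, which is exactly the variability drawback described above.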
Leave-one-out Cross-Validation
• Leave-One-Out Cross-Validation (LOOCV) is similar to the validation set approach, but it tries to address the drawbacks of that approach.
• Like the validation set approach, LOOCV splits the data set into two parts. However, instead of creating two subsets of comparable size, we use a single observation $(x_1, y_1)$ as the validation set and the remaining $n - 1$ observations $\{(x_2, y_2), \ldots, (x_n, y_n)\}$ as the training set.
• We fit a machine learning model on the $n - 1$ training observations, make a prediction $\hat{y}_1$ for the excluded observation $x_1$, and calculate $MSE_1 = (y_1 - \hat{y}_1)^2$.
• However, $MSE_1$ is highly variable because it is based on a single observation.

Leave-one-out Cross-Validation
• We then repeat the procedure by selecting $(x_2, y_2)$ as the validation data, training the machine learning model on the $n - 1$ observations $\{(x_1, y_1), (x_3, y_3), \ldots, (x_n, y_n)\}$, and computing $MSE_2 = (y_2 - \hat{y}_2)^2$.
• Repeating this approach $n$ times produces $n$ squared errors, $MSE_1, \ldots, MSE_n$.
• The LOOCV estimate of the test MSE is the average of these $n$ test error estimates:

$$CV_{(n)} = \frac{1}{n} \sum_{i=1}^{n} MSE_i$$

Leave-one-out Cross-Validation
• LOOCV has two major advantages over the validation set approach.
• First, LOOCV has far less bias. In LOOCV, we repeatedly fit the machine learning model on $n - 1$ training observations, almost as many as are in the entire data set, whereas the validation set approach splits the data into two parts of comparable size.
• Hence LOOCV does not tend to overestimate the test error rate as much as the validation set approach does.

Leave-one-out Cross-Validation
• Second, because of the randomness in the training/validation splits, the validation set approach yields different results when applied repeatedly. Performing LOOCV multiple times, in contrast, always gives the same results, because there is no randomness in the splits.
• On the other hand, LOOCV is computationally expensive because the model has to be fit $n$ times.
• If $n$ is large and each individual model is slow to fit, this can be very time consuming.

Leave-one-out Cross-Validation
• LOOCV is a general method that can be used with any kind of predictive model, including logistic regression and linear discriminant analysis. (A code sketch follows the $k$-fold discussion below.)

k-fold Cross Validation
• $k$-fold cross-validation is an alternative to LOOCV. In this method, we randomly divide the data set into $k$ groups, or folds, of approximately equal size.
• We treat the first fold as a validation set and fit the machine learning method on the remaining $k - 1$ folds.
• We calculate the mean squared error, $MSE_1$, on the observations in the held-out fold.
• We repeat the procedure $k$ times, each time treating a different group of observations as the validation set.

k-fold Cross Validation
• This yields $k$ estimates of the test error, $MSE_1, MSE_2, \ldots, MSE_k$. The $k$-fold CV estimate is computed by averaging these values:

$$CV_{(k)} = \frac{1}{k} \sum_{i=1}^{k} MSE_i$$

• A schematic of the $k$-fold cross-validation approach is shown in Figure 3.

k-fold Cross Validation
• LOOCV is a special case of $k$-fold CV in which $k = n$.
• In practice, we typically perform $k$-fold CV using $k = 5$ or $k = 10$.
• A natural question is what advantage there is in using $k = 5$ or $k = 10$ instead of $k = n$. One major advantage is computational: LOOCV requires fitting the learning method $n$ times, which makes it expensive compared with $k$-fold CV using $k = 5$ or $k = 10$.
• For example, with $k = 10$, 10-fold CV requires fitting the learning method only 10 times, which may be much more feasible.
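A minimal LOOCV sketch, again assuming Python with scikit-learn and the same kind of simulated data (illustrative choices, not from the original deck):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import LinearRegression

# Simulated data: 100 observations from a noisy linear relationship
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X.ravel() + rng.normal(0, 1, size=100)

# Fit the model n times, each time holding out a single observation
squared_errors = []
for train_idx, val_idx in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    y_hat = model.predict(X[val_idx])[0]   # prediction for the held-out point
    squared_errors.append((y[val_idx][0] - y_hat) ** 2)

# CV(n) is the average of the n squared errors
cv_n = np.mean(squared_errors)
print(f"LOOCV estimate of the test MSE: {cv_n:.3f}")
```

Running this twice gives identical output, since there is no randomness in the splits.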
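A corresponding $k$-fold sketch with $k = 10$, under the same assumptions; note that only 10 model fits are required rather than $n = 100$:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

# Simulated data as before
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X.ravel() + rng.normal(0, 1, size=100)

# Randomly divide the observations into k = 10 folds
kf = KFold(n_splits=10, shuffle=True, random_state=1)

fold_mses = []
for train_idx, val_idx in kf.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    resid = y[val_idx] - model.predict(X[val_idx])
    fold_mses.append(np.mean(resid ** 2))  # MSE_i on the held-out fold

# CV(k) is the average of the k fold-level MSEs
cv_k = np.mean(fold_mses)
print(f"10-fold CV estimate of the test MSE: {cv_k:.3f}")
```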
The Bootstrap
• The bootstrap is a widely applicable and very powerful tool.
• It is used to quantify the uncertainty associated with a given machine learning method.
• For example, we can use the bootstrap to estimate the standard errors of the coefficients from a linear regression fit.
• The bootstrap is not particularly useful in that case, because standard statistical software such as R and Python automatically provides the standard errors. For a wide range of other machine learning methods, however, standard errors are difficult to obtain, and we can use the bootstrap for those methods.

The Bootstrap
• The bootstrap estimates quantities about a population by averaging estimates from many small data samples.
• The samples are constructed by drawing observations from a larger data sample one at a time, returning each observation to the sample after it has been chosen.
• This allows a given observation to be included in a given sample more than once.
• This approach to sampling is known as sampling with replacement.

The Bootstrap
• In this example, we have a simple data set, which we call $Z$, containing only $n = 3$ observations.
• From this data set, we randomly select $n$ observations to produce a bootstrap data set $Z^{*1}$.
• Because the sampling is performed with replacement, the same observation can appear in the bootstrap data set more than once.
• In the current example, the $Z^{*1}$ data set contains the first observation once and the third observation twice, while the second observation is not present at all.

The Bootstrap
• We can use $Z^{*1}$ to calculate a bootstrap estimate of $\alpha$, which we call $\hat{\alpha}^{*1}$.
• We repeat this procedure $B$ times for some large value of $B$, producing $B$ different bootstrap data sets, $Z^{*1}, Z^{*2}, \ldots, Z^{*B}$, and $B$ corresponding estimates, $\hat{\alpha}^{*1}, \hat{\alpha}^{*2}, \ldots, \hat{\alpha}^{*B}$.
• We can then compute the standard error of these bootstrap estimates. (A code sketch appears at the end of the deck.)

Subset Selection
• There are two reasons why we may not be satisfied with the least squares estimates.
• The first is prediction accuracy: the least squares estimates often have low bias but large variance. Prediction accuracy can sometimes be improved by shrinking some coefficients, or setting them to zero. By doing so we sacrifice a little bias to reduce the variance of the predicted values, and hence may improve the overall prediction accuracy.
• The second reason is interpretation. With a large number of predictors, we often would like to determine a smaller subset that exhibits the strongest effects. In order to get the "big picture", we are willing to sacrifice some of the small details.

Shrinkage Methods
• By retaining a subset of the predictors and discarding the rest, subset selection produces a model that is interpretable and possibly has lower prediction error than the full model.
• However, because subset selection is a discrete process (variables are either retained or discarded), it often exhibits high variance, and so may fail to reduce the prediction error of the full model.
• Shrinkage methods are more continuous and do not suffer as much from high variability. (A second sketch at the end of the deck contrasts least squares with two shrinkage fits.)

Thanks
Samatrix Consulting Pvt Ltd
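A minimal sketch of the bootstrap standard-error procedure, assuming Python with NumPy; the data and the statistic $\hat{\alpha}$ (taken here to be the sample median) are purely illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Original data set Z; the statistic alpha_hat below is the sample
# median, chosen only for illustration
Z = rng.normal(loc=5.0, scale=2.0, size=100)

def alpha_hat(sample):
    return np.median(sample)

B = 1000                       # number of bootstrap data sets
n = len(Z)
boot_estimates = np.empty(B)
for b in range(B):
    # Draw n observations from Z with replacement to form Z*b
    Z_star = rng.choice(Z, size=n, replace=True)
    boot_estimates[b] = alpha_hat(Z_star)

# The standard deviation of the B bootstrap estimates approximates
# the standard error of alpha_hat
se_boot = boot_estimates.std(ddof=1)
print(f"Bootstrap estimate of SE(alpha_hat): {se_boot:.4f}")
```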
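A minimal sketch contrasting least squares with ridge regression and the lasso, assuming scikit-learn and simulated data in which only a few predictors truly matter; ridge shrinks all coefficients toward zero, while the lasso can set some exactly to zero:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

# Simulated data: 10 predictors, only the first 3 have nonzero effects
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))
beta = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ beta + rng.normal(scale=1.0, size=80)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # shrinks coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X, y)    # can set coefficients exactly to zero

print("OLS:  ", np.round(ols.coef_, 2))
print("Ridge:", np.round(ridge.coef_, 2))
print("Lasso:", np.round(lasso.coef_, 2))
```

The lasso output typically shows several coefficients driven exactly to zero, which makes visible the continuous-versus-discrete contrast with subset selection drawn above.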