Ch5 Slides: Variable Selection
04/2025
Chapter 5
References:
Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to Linear Regression Analysis (5th ed.). Wiley. Chapter 10: Variable Selection and Model Building.
Outline
Introduction
Goals of model selection
Criteria to compare models
Model-building problem
Consequences of model misspecification
Criteria for evaluating equations
Model selection: goals
Model selection: strategies
Possible criteria
Model-building problem
• Building a regression model that includes only a subset of the available regressors involves two conflicting objectives:
1. The model should include as many regressors as possible, so that the information content in these factors can influence the predicted value of y;
2. The model should include as few regressors as possible, because the variance of the prediction ŷ increases as the number of regressors increases. Also, the more regressors, the greater the costs of data collection and model maintenance.
The process of finding a model that compromises between these two objectives is called selecting the “best” regression equation.
• None of the variable selection procedures is guaranteed to produce the best regression equation for a given data set.
Consequences of model misspecification
• Assume that there are K candidate regressors x_1, ..., x_K and n ≥ K + 1 observations on these regressors and the response y:

y_i = β_0 + Σ_{j=1}^{K} β_j x_{ij} + ε_i,  i = 1, ..., n,   or   y = Xβ + ε   (1)

• Let r be the number of regressors that are deleted from (1). Then the number of variables that are retained is p = K + 1 − r, i.e. the subset model contains p − 1 = K − r of the original regressors:

y = X_p β_p + X_r β_r + ε   (2)
• The properties of the subset-model estimates β̂_p and σ̂², compared with the full-model estimates β̂* and σ̂*²:
◦ E(β̂_p) = β_p + (X_p′X_p)⁻¹X_p′X_r β_r = β_p + Aβ_r, where A = (X_p′X_p)⁻¹X_p′X_r;
◦ Var(β̂_p) = σ²(X_p′X_p)⁻¹ and Var(β̂*) = σ²(X′X)⁻¹. Also, Var(β̂*_p) − Var(β̂_p) is positive semidefinite, where β̂*_p denotes the full-model estimates of the coefficients in β_p; that is, the subset-model estimates are at least as precise as the corresponding full-model estimates;
◦ Since β̂_p is a biased estimate of β_p and β̂*_p is not, it is more reasonable to compare the precision of the parameter estimates from the full and subset models in terms of mean square error (see the sketch after this list);
◦ The estimate σ̂*² from the full model is an unbiased estimate of σ². However, for the subset model,
  E(σ̂²) = σ² + β_r′X_r′[I − X_p(X_p′X_p)⁻¹X_p′]X_r β_r / (n − p).
  That is, σ̂² is generally biased upward as an estimate of σ²;
◦ Suppose we wish to predict the response at the point x′ = [x_p′, x_r′]. If we use the full model, the predicted value is ŷ* = x′β̂*, with mean x′β and prediction variance Var(ŷ*) = σ²[1 + x′(X′X)⁻¹x].
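A minimal simulation sketch (not from the slides; the model, coefficient values, and variable names are invented for illustration) of the trade-off above: deleting a regressor biases the estimate of the retained coefficient but reduces its variance, so its mean square error can still come out smaller than that of the full-model estimate.

```r
## Hypothetical model: compare the full-model and subset-model estimates of
## beta1 when the (weak, correlated) regressor x2 is deleted from the fit.
set.seed(1)
n <- 30; b0 <- 1; b1 <- 2; b2 <- 0.1          # true coefficients; b2 is small
reps <- 2000
est_full <- est_sub <- numeric(reps)
for (r in seq_len(reps)) {
  x1 <- rnorm(n)
  x2 <- 0.6 * x1 + rnorm(n)                   # x2 correlated with x1
  y  <- b0 + b1 * x1 + b2 * x2 + rnorm(n)
  est_full[r] <- coef(lm(y ~ x1 + x2))["x1"]  # full model
  est_sub[r]  <- coef(lm(y ~ x1))["x1"]       # subset model (x2 deleted)
}
rbind(full   = c(bias = mean(est_full) - b1, var = var(est_full),
                 mse  = mean((est_full - b1)^2)),
      subset = c(bias = mean(est_sub)  - b1, var = var(est_sub),
                 mse  = mean((est_sub  - b1)^2)))
```

With these invented settings the subset estimate is biased (toward β_1 + Aβ_2) but less variable, so its MSE can come out smaller here, which is the mean-square-error comparison described in the bullets above.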
Motivation for variable selection
• Deleting variables from the model can improve the precision of the parameter estimates of the retained variables, even though some of the deleted variables are not negligible.
◦ The same is true for the variance of a predicted response;
◦ Deleting variables potentially introduces bias into the estimates of the coefficients of the retained variables and of the response;
◦ However, if the deleted variables have small effects, the MSE of the biased estimates will be less than the variance of the unbiased estimates;
◦ There is also a danger in retaining negligible variables, that is, variables with zero coefficients or with coefficients smaller than their corresponding standard errors from the full model: the variances of the parameter estimates and of the predicted response are increased.
Criteria for evaluating subset regression models
• Two key aspects of variable selection:
◦ Generating the subset models;
◦ Deciding if one subset is better than another.
Computational methods for variable selection
• (Coefficient of multiple determination R²) A measure of the adequacy of a regression model. Let R_p² denote the coefficient of multiple determination for a subset regression model with p terms, that is, p − 1 regressors and an intercept term β_0:

R_p² = SS_R(p) / SS_T = 1 − SS_Res(p) / SS_T   (4)
◦ There are (K choose p − 1) values of R_p² for each value of p; R_p² increases as p increases and is a maximum when p = K + 1;
◦ The analyst uses this criterion by adding regressors to the model up to the point where an additional variable is not useful, in that it provides only a small increase in R_p².
• Since we cannot find an “optimum” value of R² for a subset regression model, we must look for a “satisfactory” value.
◦ R_0² = 1 − (1 − R²_{K+1})(1 + d_{α,n,K}),   (5)
  where d_{α,n,K} = K F_{α,K,n−K−1} / (n − K − 1) and R²_{K+1} is the value of R² for the full model;
◦ Any subset of regressor variables producing an R² greater than R_0² is called an R²-adequate (α) subset (a short computational sketch follows at the end of this slide).
• (Adjusted R²) The adjusted R² statistic, defined for a p-term equation as

R²_{Adj,p} = 1 − ((n − 1) / (n − p)) (1 − R_p²)   (6)

◦ The R²_{Adj,p} statistic does not necessarily increase as additional regressors are introduced into the model;
◦ In fact, if s regressors are added to the model, R²_{Adj,p+s} will exceed R²_{Adj,p} if and only if the partial F statistic for testing the significance of the s additional regressors exceeds 1;
◦ Consequently, one criterion for selection of an optimum subset model is to choose the model that has a maximum R²_{Adj,p}.
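A small computational sketch of the R²-adequate cutoff in (5); α, K, n, and the full-model R² below are placeholder values, not taken from any data set in the slides.

```r
## R^2-adequate cutoff from (5); all inputs are placeholder values.
alpha   <- 0.05
K       <- 5            # number of candidate regressors in the full model
n       <- 40           # number of observations
R2_full <- 0.85         # R^2 of the full (K + 1 term) model
d  <- K * qf(1 - alpha, K, n - K - 1) / (n - K - 1)   # d_{alpha,n,K}
R0 <- 1 - (1 - R2_full) * (1 + d)
R0   # any subset with R_p^2 above this value is R^2-adequate at level alpha
```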
• (Residual mean square) The residual mean square for a subset regression model is

MS_Res(p) = SS_Res(p) / (n − p)   (7)

(Remark: the eventual increase in MS_Res(p) occurs when the reduction in SS_Res(p) from adding a regressor to the model is not sufficient to compensate for the loss of one degree of freedom in the denominator of (7).)
◦ The subset regression model that minimizes MS_Res(p) will also maximize R²_{Adj,p}. Thus, the criteria minimum MS_Res(p) and maximum adjusted R² are equivalent (see the comparison sketch below).
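An illustrative comparison of the three criteria so far; the data set (mtcars) and regressor names are only a stand-in, not an example from the slides.

```r
## R_p^2, adjusted R^2, and MS_Res(p) for a few nested candidate models
## (mtcars used purely as a stand-in data set).
fits <- list(p2 = lm(mpg ~ wt,             data = mtcars),
             p3 = lm(mpg ~ wt + hp,        data = mtcars),
             p4 = lm(mpg ~ wt + hp + qsec, data = mtcars))
t(sapply(fits, function(f) {
  s <- summary(f)
  c(p     = length(coef(f)),   # number of terms, including the intercept
    R2    = s$r.squared,
    adjR2 = s$adj.r.squared,
    MSRes = s$sigma^2)         # SS_Res(p) / (n - p)
}))
```

Whichever candidate has the smallest MSRes also has the largest adjR2, consistent with the equivalence noted above.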
• (Mallows’s C_p statistic) For a candidate model M with p(M) terms,

C_p(M) = SS_Res(M) / σ̂² − n + 2·p(M)   (8)

where σ̂² is usually the residual mean square from the full model.
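A minimal sketch of computing (8) directly, reusing the stand-in mtcars example, with σ̂² taken from the full model.

```r
## Mallows's C_p from (8), with sigma^2 estimated by MS_Res of the full model
## (mtcars again as a stand-in data set).
full   <- lm(mpg ~ wt + hp + qsec, data = mtcars)
sub    <- lm(mpg ~ wt + hp,        data = mtcars)
sigma2 <- summary(full)$sigma^2               # hat(sigma)^2 from the full model
n      <- nrow(mtcars)
Cp_sub <- sum(resid(sub)^2) / sigma2 - n + 2 * length(coef(sub))
Cp_sub   # values near p(M) suggest little bias in the subset model
```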
• (AIC & BIC)
? Mallow’s Cp is (almost) a special case of Akaike Information Criterion(AIC)
AIC (M) = −2logL(M) + 2 · p (M).
? L(M) is the likelihood function of the parameters in model M evaluated
at the MLE (Maximum Likelihood Estimators).
? Schwarz’s Bayesian Information Criterion (BIC)
BIC (M) = −2logL(M) + p (M) · log n.
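Both criteria are available directly through the stats generics AIC() and BIC(); a short sketch on the stand-in example (note that step() internally uses extractAIC(), which for linear models differs from AIC() by an additive constant, so only comparisons within one criterion matter).

```r
## AIC and BIC for two candidate models (mtcars as a stand-in data set).
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
sub  <- lm(mpg ~ wt + hp,        data = mtcars)
AIC(sub, full)   # -2 log L(M) + 2 * (number of estimated parameters)
BIC(sub, full)   # -2 log L(M) + (number of estimated parameters) * log n
```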
Search strategies
• “Best subset”: search all possible models and take the one with the highest R²_Adj or lowest C_p.
• Stepwise (forward, backward or both): useful when the number of predictors is
large. Choose an initial model and be “greedy”.
• “Greedy” means always take the biggest jump (up or down) in your selected
criterion.
Implementations in R
• “Best subset”: use the function leaps. Works only for multiple linear regression models.
• Stepwise: use the function step. Works for any model that has an Akaike Information Criterion (AIC). In multiple linear regression, AIC is (almost) a linear function of C_p (see the sketch below).
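A sketch of the two routes named above. The slide refers to the function leaps; regsubsets() from the same leaps package is a commonly used formula interface to the same search, and the mtcars variables are, as before, only a stand-in example.

```r
## Best-subset search (leaps package) and stepwise search (step), sketched
## on the stand-in mtcars example.
library(leaps)

# Exhaustive ("best subset") search over the candidate regressors
all_sub <- regsubsets(mpg ~ wt + hp + qsec + disp, data = mtcars)
summary(all_sub)$adjr2   # adjusted R^2 of the best model of each size
summary(all_sub)$cp      # Mallows's C_p of the best model of each size

# Greedy stepwise search driven by AIC, starting from the full model
full <- lm(mpg ~ wt + hp + qsec + disp, data = mtcars)
step(full, direction = "both", trace = FALSE)
```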
Introduction
• Cases: these procedures are used when there is a large number of potential explanatory variables (say q); they do not involve computing all 2^q possible equations.
• Common feature: variables are introduced into or deleted from the equation one at a time, so only a subset of all possible equations is examined (at most about q + 1 equations are evaluated).
• Procedure categories:
◦ Forward selection (FS) procedure
  – works through the full set of variables and produces up to q possible equations;
◦ Backward elimination procedure
  – involves fitting at most q regression equations;
◦ Stepwise regression (a modification of the FS procedure)
  – allows a number of possible combinations of the two procedures above.
• (Forward selection) Step 1: the first variable entered is the one with the highest simple correlation with y; it is retained if its coefficient is significantly different from zero.
• Step 2: the second variable is the one which has the highest correlation with y after y has been adjusted for the effect of the first variable,
◦ i.e. the variable with the highest simple correlation coefficient with the residuals from step 1 (see the sketch after this list);
◦ retain x_2 when β_2 is significantly different from zero;
◦ then search for the next variable in the same way.
• · · ·
• Terminate the procedure when the newly entered variable has an insignificant coefficient β_q:
◦ judged by the standard t-statistic computed from the latest equation;
◦ mostly by a low t cutoff value for testing the coefficient of the newly entered variable.
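A sketch of the selection step just described (variable names from the stand-in mtcars example; in practice add1() or step() automate this): the next variable entered is the one most correlated with the residuals of the current fit, and it is kept only if its t-statistic is significant.

```r
## Forward-selection step 2, as described above (mtcars as a stand-in):
## pick the candidate most correlated with the step-1 residuals, then
## check the t-statistic of its coefficient in the enlarged model.
fit1 <- lm(mpg ~ wt, data = mtcars)                   # step 1 fit
r1   <- resid(fit1)
candidates <- c("hp", "qsec", "disp")
cors <- sapply(mtcars[candidates], function(x) cor(r1, x))
best <- names(which.max(abs(cors)))                   # highest |correlation|
fit2 <- update(fit1, as.formula(paste(". ~ . +", best)))
summary(fit2)$coefficients[best, ]                    # estimate, SE, t, p-value
```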
Strategy for variable selection and model building
[Flowchart (figure not reproduced); recoverable steps include: perform all possible regressions, perform residual analysis, select models for further analysis, and make recommendations.]
Variable selection for high-dimensional data