Unit 1: Advanced Algorithms
• Feature extraction and engineering on numerical, categorical and text data.
• The concept of feature scaling and feature selection.
Model Selection:
Model selection is the process of choosing the most suitable model from a set of candidate models for a given problem. For example, you may need to decide whether a Neural Network model or a simpler model fits your problem better, which requires comparing the candidates directly. In order to select a model, you must first examine the type of problem you are trying to solve and the type of data you have.
Numerical data: You may use Support Vector Machines (SVM), logistic regression, and decision trees, depending on your data.
How do we select a model based on the task?
Classification tasks: SVM, logistic regression, decision trees, etc.
Regression tasks: Linear regression, random forest, polynomial regression, etc.
Clustering tasks: K-means clustering, hierarchical clustering, etc.
Depending on the type and amount of data you have and the task you are performing, you may use a variety of models.
M11t11·I S1•it•1111111 I 1•c·l,11l1 1111·N:
[Figure: model selection techniques: random train/test split, probabilistic measures, and resampling]
Cross-validation:
It is a resampling procedure to evaluate models by splitting the data. Consider a situation where you have two models and want to determine which one is the most appropriate for a certain issue. In this case, we can use a cross-validation process.
So, let's say you are working on an SVM model and have a dataset over which you iterate multiple times. We will now divide the dataset into a few groups; in each iteration, one group out of the five will be used as test data. Machine learning models are evaluated on test data after being trained on training data.
Let's say you calculated the accuracy of each iteration; the figure below illustrates the iteration and the accuracy of that iteration.
[Figure: in each iteration, a different fold of the dataset is held out as test data and the rest is used as training data]
Iteration 1: 88%    Iteration 2: 83%    Iteration 3: 86%
Iteration 4: 82%    Iteration 5: 84%    Iteration 6: 85%
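As a concrete sketch of this procedure, scikit-learn's cross_val_score can reproduce the per-iteration accuracies described above. The dataset, the six-fold split and the SVM-with-scaling pipeline below are illustrative assumptions, not prescribed by the text.

    from sklearn.datasets import load_wine
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_wine(return_X_y=True)               # stand-in labelled dataset
    model = make_pipeline(StandardScaler(), SVC())  # SVM model, as in the example

    scores = cross_val_score(model, X, y, cv=6)     # six folds, as in the figure
    for i, acc in enumerate(scores, start=1):
        print(f"Iteration {i}: {acc:.2%}")
    print(f"Mean accuracy: {scores.mean():.2%}")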
Probabilistic Measures:
Probabilistic measures score a candidate model by weighing how well it fits the data against the complexity of the model, i.e., the amount of information (the number of parameters, or the number of bits needed to describe the model). A widely used measure of this kind is the Bayesian Information Criterion (BIC):
BIC = k ln(n) - 2 ln(L)
where L is the maximized value of the likelihood function of the model,
n is the number of data points, and
k is the number of free parameters to be estimated.
BIC is more commonly employed in time series and linear regression models. However, it may be applied broadly to any model based on maximum likelihood.
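To make the formula concrete, the sketch below computes BIC for an ordinary least-squares fit using the Gaussian log-likelihood. The synthetic data and the parameter count are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=100)

    model = LinearRegression().fit(X, y)
    resid = y - model.predict(X)
    n = len(y)
    k = X.shape[1] + 2   # assumed free parameters: coefficients + intercept + noise variance

    # Maximized Gaussian log-likelihood, then BIC = k*ln(n) - 2*ln(L)
    log_L = -0.5 * n * (np.log(2 * np.pi) + np.log(resid.var()) + 1)
    bic = k * np.log(n) - 2 * log_L
    print(f"BIC = {bic:.1f}")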
Structural Risk Minimization (SRM):
There are instances of overfitting when the model becomes biased toward the training data, which is its primary source of learning. A generalized model must frequently be chosen from a limited data set in machine learning, which leads to the issue of overfitting: the model becomes too fitted to the specifics of the training set and performs poorly on new data. By weighing the model's complexity against how well it fits the training data, the SRM principle solves this issue.
R_srm(f) = (1/N) Σ_{i=1}^{N} L(y_i, f(x_i)) + λ J(f)
Here, J(f) is the complexity of the model, L(y_i, f(x_i)) is the loss on the i-th training example, and λ weighs the complexity penalty against the empirical risk.
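As an illustrative sketch (not the text's own example), ridge-style regularization can be read as one instance of this principle, with squared error as the empirical risk and J(f) = ||w||^2 as the complexity term; the data below are synthetic.

    import numpy as np

    def srm_risk(w, X, y, lam):
        """(1/N) * sum of squared errors + lam * ||w||^2 (a ridge-style J(f))."""
        residuals = y - X @ w
        return np.mean(residuals ** 2) + lam * np.sum(w ** 2)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 4))
    y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

    w = np.linalg.lstsq(X, y, rcond=None)[0]   # unregularized least-squares fit
    print(srm_risk(w, X, y, lam=0.0), srm_risk(w, X, y, lam=0.1))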
Metrics for Evaluating Regression Models:
Model evaluation is crucial in machine learning. It simplifies presenting your model to others and helps you understand how well it performs. Several evaluation metrics are available, but only a few can be employed with regression.
• Mean Absolute Error (MAE): The MAE adds up each error's absolute value. It is an important metric to evaluate a model. You can simply calculate MAE by importing:
from sklearn.metrics import mean_absolute_error
• Mean Squared Error (MSE): While MAE handles all errors equally, MSE is computed by adding the squares of the differences between the real output and the expected output, then dividing the result by the total number of data points. It provides an exact number indicating how much your findings differ from what you projected.
from sklearn.metrics import mean_squared_error
• Adjusted R Square: R Square quantifies how much of the variation in the dependent variable the model can account for. Its name, R Square, refers to the fact that it is the square of the correlation coefficient (R). Adjusted R Square additionally penalizes the score for the number of predictors used.
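A minimal sketch of these metrics with scikit-learn; the toy values are assumptions, and the adjusted R Square line uses the standard formula 1 - (1 - R^2)(n - 1)/(n - k - 1), which the text does not spell out.

    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    y_true = [3.0, 5.0, 7.5, 10.0]   # assumed ground-truth values
    y_pred = [2.8, 5.4, 7.0, 10.3]   # assumed model predictions

    mae = mean_absolute_error(y_true, y_pred)
    mse = mean_squared_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)

    n, k = len(y_true), 2            # k = number of predictors (assumed)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    print(mae, mse, r2, adj_r2)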
When comparing machine learning models, you must choose a tool or platform that can support your team's needs and your business goal. With Censius, you can monitor each model's health in one place and use the user-friendly interface to comprehend models and analyze them for particular problems. With it, you can:
• Evaluate performance without ground truth.
• Compare the past performance of a model.
• Create personalized dashboards.
• Compare performance between model iterations.
TRAINING A MODEL FOR SUPERVISED LEARNING FEATURES -
UNDERSTAND YOUR DATA BETTER, FEATURE EXTRACTION AND
ENGINEERING
Training a model for supervised learning involves several steps, and understanding your data is a crucial part of this process. Feature extraction and engineering are techniques that help you represent your data in a way that is conducive to learning for your machine learning model. Here's a step-by-step guide (a minimal end-to-end code sketch follows the steps):
1. Understand Your Data:
(a) Exploratory Data Analysis (EDA):
• Examine the structure of your dataset.
• Check for missing values, outliers, and anomalies.
• Understand the distribution of your target variable.
(b) Statistical Summary:
• Use descriptive statistics to summarize key aspects of your data.
• Identify patterns, trends, and relationships.
(c) Visualization:
• Create visualizations (histograms, scatter plots, etc.) to gain insights.
• Identify potential correlations between features and the target variable.
2. Feature Extraction:
(a) Select Relevant Features:
• Identify features that are likely to have a significant impact on the target variable.
• Remove irrelevant or redundant features that may not contribute to the model's performance.
(b) Handling Categorical Data:
• Encode categorical variables using techniques like one-hot encoding or label encoding.
(c) Feature Scaling:
• Standardize or normalize numerical features to ensure they are on a similar scale.
• This is crucial for algorithms sensitive to feature scales, such as gradient-based optimization methods.
3. Feature Engineering:
(a) Create New Features:
• For example, extract date features from a timestamp, create interaction terms, or combine existing features.
(b) Polynomial Features:
• Introduce polynomial features to capture non-linear relationships.
• For instance, square or cube certain features to account for quadratic or cubic patterns.
(c) Dimensionality Reduction:
• Use techniques like Principal Component Analysis (PCA) to reduce the dimensionality of the dataset while retaining essential information.
4. Data Splitting:
(a) Training and Testing Sets:
• Split your dataset into training and testing sets to evaluate your model's performance on unseen data.
5. Model Training:
(a) Choose a Model:
• Select a suitable algorithm based on your problem (for example, regression, classification) and data characteristics.
(b) Train the Model:
• Fit the chosen model on the training data.
6. Model Evaluation:
(a) Evaluate on the Testing Set:
• Assess the model's performance on the testing set to estimate its generalization capability.
(b) Fine-tuning:
• If needed, fine-tune hyperparameters to improve performance.
7. Iterative Process:
(a) Refinement:
• Based on model performance, go back to feature engineering or adjust the model architecture.
(b) Cross-Validation:
• Perform cross-validation to ensure the robustness of your model.
8. Deployment:
• Once satisfied with the model, deploy it to make predictions on new, unseen data.
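The sketch below strings the steps above together with scikit-learn. The toy dataframe, column names and the random-forest model are illustrative assumptions rather than choices prescribed by the text.

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    # 1. Understand your data (EDA abbreviated to a statistical summary).
    df = pd.DataFrame({
        "age": [22, 35, 47, 52, 29, 41, 60, 33],
        "income": [28000, 52000, 61000, 75000, 39000, 58000, 82000, 45000],
        "city": ["A", "B", "A", "C", "B", "C", "A", "B"],
        "bought": [0, 1, 1, 1, 0, 0, 1, 0],
    })
    print(df.describe(include="all"))

    # 2.-3. Feature extraction/engineering: encode categoricals, scale numericals.
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), ["age", "income"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ])

    # 4. Data splitting.
    X, y = df.drop(columns="bought"), df["bought"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # 5. Model training.
    model = Pipeline([("prep", preprocess),
                      ("clf", RandomForestClassifier(random_state=0))])
    model.fit(X_train, y_train)

    # 6. Evaluation on the held-out test set.
    print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))

    # 7. Cross-validation for robustness (fine-tuning would follow here).
    print("CV accuracy:", cross_val_score(model, X, y, cv=4).mean())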
FEATURE ENGINEERING ON NUMERICAL DATA, CATEGORICAL DATA AND TEXT DATA
What is Feature Engineering?
Fig. 1.3: Feature engineering
Fig. 1.4: The impact of Standardization and Normalization on the Wine dataset
Methods for Scaling:
Now that you have an idea of what feature scaling is, let us explore what methods are available for doing feature scaling. Of all the methods available, the most common ones are:
Normalization:
Also known as min-max scaling or min-max normalization, it is the simplest method and consists of rescaling the range of features to the range [0, 1]. The general formula for normalization is given as:
x' = (x - min(x)) / (max(x) - min(x))
Here, max(x) and min(x) are the maximum and the minimum values of the feature respectively.
We can also rescale to a different interval. For example, to have the variable lie in an arbitrary range [a, b], the formula becomes:
x' = a + ((x - min(x)) (b - a)) / (max(x) - min(x))
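A small numeric check of the two formulas above; the sample values and the target range [-1, 1] are assumptions for illustration.

    import numpy as np

    x = np.array([2.0, 5.0, 8.0, 11.0])
    x01 = (x - x.min()) / (x.max() - x.min())                # min-max to [0, 1]

    a, b = -1.0, 1.0                                         # arbitrary target range
    xab = a + (x - x.min()) * (b - a) / (x.max() - x.min())
    print(x01)   # approx. [0, 0.33, 0.67, 1]
    print(xab)   # approx. [-1, -0.33, 0.33, 1]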
Standardization:
Feature standardization makes the values of each feature in the data have zero mean and unit variance. The general method of calculation is to determine the distribution mean and standard deviation for each feature, and to calculate the new data points by the following formula:
x' = (x - x̄) / σ
Here, σ is the standard deviation of the feature vector and x̄ is the average of the feature vector.
Scaling to unit length: The aim of this method is to scale the components of a feature vector such that the complete vector has length one. This usually means dividing each component by the Euclidean length of the vector:
x' = x / ||x||
Here, ||x|| is the Euclidean length of the feature vector.
In addition to the above widely used methods, there are some other methods to scale the features, viz. Power Transformer, Quantile Transformer, Robust Scaler, etc. For the scope of this discussion, we are deliberately not diving into the details of these techniques.
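For reference, scikit-learn provides ready-made implementations of the three methods above (as well as the Power, Quantile and Robust transformers just mentioned); a minimal sketch on an assumed toy matrix:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, Normalizer, StandardScaler

    X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 600.0]])

    print(MinMaxScaler().fit_transform(X))     # x' = (x - min) / (max - min), per column
    print(StandardScaler().fit_transform(X))   # x' = (x - mean) / std, per column
    print(Normalizer().fit_transform(X))       # each row scaled to unit Euclidean length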
The Million-Dollar Question: Normalization or Standardization?
If you have ever built a machine learning pipeline, you must have faced this question of whether to Normalize or to Standardize. While there is no obvious answer to this question and it really depends on the application, there are still a few generalizations that can be drawn.
Normalization is good to use when the distribution of data does not follow a Gaussian distribution. It can be useful in algorithms that do not assume any distribution of the data, like K-Nearest Neighbors.
In Neural Networks and other algorithms that require data on a 0-1 scale, normalization is an essential pre-processing step. Another popular example of data normalization is image processing, where pixel intensities have to be normalized to fit within a certain range (i.e., 0 to 255 for the RGB color range).
Standardization can be helpful in cases where the data follows a Gaussian distribution, though this does not necessarily have to be true. Since standardization does not have a bounding range, even if there are outliers in the data, they will not be affected by standardization.
In clustering analyses, standardization comes in handy to compare similarities between features based on certain distance measures. Another prominent example is Principal Component Analysis, where we usually prefer standardization over min-max scaling since we are interested in the components that maximize the variance.
There are some points which can be considered while deciding whether we need Standardization or Normalization:
• Standardization may be used when data represent a Gaussian distribution, while Normalization is great with a non-Gaussian distribution.
• The impact of outliers is very high in Normalization.
To conclude, you can always start by fitting your model to raw, normalized, and standardized data and compare the performance for the best results.
The Link between Data Scaling and Data Leakage
To apply Normalization or Standardization, we can use the prebuilt functions in scikit-learn or create our own custom function.
Data leakage mainly occurs when some information from the training data is revealed to the validation data. In order to prevent this, the point to pay attention to is to fit the scaler on the train data only and then use it to transform the test data.
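A minimal sketch of the leak-free pattern just described, with synthetic data as a stand-in:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    X = np.random.default_rng(0).normal(size=(100, 4))
    y = (X[:, 0] > 0).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)  # statistics learned from train only
    X_test_scaled = scaler.transform(X_test)        # no refit on test: prevents leakage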
Define Feature Selection:
Feature Selection is defined as, "It is a process of automatically or manually selecting the subset of the most appropriate and relevant features to be used in model building."
What is Feature Selection?
A feature is an attribute that has an impact on a problem or is useful for the problem, and choosing the important features for the model is known as feature selection. Each machine learning process depends on feature engineering, which mainly contains two processes: Feature Selection and Feature Extraction. Although feature selection and extraction
processes may have the same objective, both are completely different from each other. The main difference between them is that feature selection is about selecting a subset of the original feature set, whereas feature extraction creates new features.
Feature selection is a way of reducing the input variables for the model by using only relevant data in order to reduce overfitting in the model.
So, we can define Feature Selection as, "It is a process of automatically or manually selecting the subset of the most appropriate and relevant features to be used in model building." Feature selection is performed by either including the important features or excluding the irrelevant features in the dataset without changing them.
Need for Feature Selection:
Before implementing any technique, it is important to understand the need for it, and so for Feature Selection. As we know, in machine learning it is necessary to provide a pre-processed and good input dataset in order to get better outcomes. We collect a huge amount of data to train our model and help it to learn better. Generally, the dataset consists of noisy data, irrelevant data, and some part of useful data. Moreover, the huge amount of data also slows down the training process of the model, and with noise and irrelevant data, the model may not predict and perform well. So, it is very necessary to remove such noise and less-important data from the dataset, and to do this, Feature Selection techniques are used.
Selecting the best features helps the model to perform well. For example, suppose we want to create a model that automatically decides which car should be crushed for a spare part, and to do this, we have a dataset. This dataset contains the Model of the car, Year, Owner's name, and Miles. In this dataset, the name of the owner does not contribute to the model performance as it does not decide whether the car should be crushed or not, so we can remove this column and select the rest of the features (columns) for the model building.
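In code, this manual selection step simply drops the uninformative column; the column names and values below are assumptions for illustration.

    import pandas as pd

    cars = pd.DataFrame({
        "model": ["A", "B", "C"],
        "year": [2009, 2015, 2012],
        "owner_name": ["Ravi", "Meena", "John"],   # does not affect the decision
        "miles": [120000, 40000, 85000],
    })
    X = cars.drop(columns=["owner_name"])          # keep only the relevant features
    print(X.columns.tolist())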
Below are some benefits of using feature selection in machine learning:
• It helps in avoiding the curse of dimensionality.
• It helps in the simplification of the model so that it can be easily interpreted by the researchers.
• It reduces the training time.
• It reduces overfitting and hence enhances generalization.
Feature Selection Techniques:
There are mainly two types of Feature Selection techniques, which are:
• Supervised Feature Selection technique: Supervised feature selection techniques consider the target variable and can be used for the labelled dataset.
• Unsupervised Feature Selection technique: Unsupervised feature selection techniques ignore the target variable and can be used for the unlabelled dataset.
Fig. 1.5: Feature Selection techniques: Supervised Feature Selection and Unsupervised Feature Selection (methods shown include missing value ratio, forward feature selection, backward feature selection, regularization (L1, L2), random forest importance, and information gain)
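As an illustrative sketch of two of the supervised techniques named in the figure, the snippet below scores features by mutual information (an information-gain style filter) and by random forest importance; the dataset is a stand-in, not one used in the text.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    X, y = load_breast_cancer(return_X_y=True)

    # Filter approach: keep the 5 features with the highest mutual information.
    selector = SelectKBest(mutual_info_classif, k=5).fit(X, y)
    print("Selected feature indices:", selector.get_support(indices=True))

    # Embedded approach: rank features by random forest importance.
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print("Most important feature index:", forest.feature_importances_.argmax())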