ML Assignment 1

Uploaded by

it12212023

Q1) What are bias and variance in underfitting? Why can a model with high bias and low variance be a problem?
Bias: Error due to highly simplistic assumptions in a machine learning model.
Variance: Error due to excessive complexity, leading to sensitivity to small fluctuations in the training data. (High variance: overfitting)
If a model is too simple (high bias) and fails to capture the patterns in the data (low variance), it leads to poor performance on both training and testing.
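
The underfitting behaviour described in Q1 can be sketched on toy data. This is a minimal illustration, assuming made-up data that follows a straight line with a small alternating wobble; the helper names (`make_data`, `fit_constant`, `fit_line`) are hypothetical. A model that ignores the input entirely stands in for "high bias", and it does poorly on both the training and the test points.

```python
# Toy data: y depends linearly on x, plus a small alternating wobble
# that stands in for noise (all values are made up for illustration).
def make_data(xs):
    return [(x, 3.0 * x + 2.0 + (0.5 if i % 2 == 0 else -0.5))
            for i, x in enumerate(xs)]

def mse(model, data):
    """Mean squared error of a model (a callable) on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fit_constant(data):
    """High-bias model: ignores x and always predicts the training mean."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def fit_line(data):
    """Ordinary least squares for a single feature (closed form)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

train = make_data([0, 1, 2, 3, 4, 5, 6, 7])
test = make_data([0.5, 1.5, 2.5, 3.5, 4.5, 5.5])

constant, line = fit_constant(train), fit_line(train)
for name, model in [("constant", constant), ("line", line)]:
    print(name, "train MSE:", round(mse(model, train), 2),
          "test MSE:", round(mse(model, test), 2))
```

The constant model has a large error on both sets, which is the signature of underfitting; the line, which matches the structure of the data, has a small error on both.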

Q2) Your model scores well in a 5-fold cross-validation, but when you evaluate on the test set you get only 70%. What are possible reasons for this mismatch?
i) Data leakage: Information from the test data set might have leaked into training (e.g. preprocessing before splitting).
ii) Different distribution: Train and test sets come from different distributions.
iii) Overfitting on validation: The model may have been tuned too much on cross-validation, leading to poor generalization.
iv) Randomness or small test size.
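
The "preprocessing before splitting" leak mentioned in Q2 can be sketched with a simple standardization step. This is a minimal illustration with hypothetical values and helper names (`fit_scaler`, `transform`): the scaling statistics are learned from the training rows only and then reused for the test rows.

```python
import statistics

def fit_scaler(values):
    """Learn mean and (population) standard deviation from the given values only."""
    return statistics.mean(values), statistics.pstdev(values)

def transform(values, scaler):
    """Standardize values with previously learned statistics."""
    mean, std = scaler
    return [(v - mean) / std for v in values]

feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 50.0, 60.0]  # hypothetical values
train, test = feature[:6], feature[6:]                 # split FIRST

leaky_scaler = fit_scaler(feature)  # wrong: statistics include the test rows
clean_scaler = fit_scaler(train)    # right: statistics come from training rows only

train_scaled = transform(train, clean_scaler)
test_scaled = transform(test, clean_scaler)  # reuse the training statistics

print("leaky mean:", leaky_scaler[0], "clean mean:", clean_scaler[0])
```

In the leaky version the test rows shift the mean and standard deviation that the training data is scaled with, so information about the test set reaches the model before evaluation.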

Q3) You build 3 different machine learning models for a classification task, but none of them individually achieves high accuracy. However, when you combine their predictions, the overall performance improves. What is this technique called, and how do methods like Bagging, Boosting and Stacking contribute to improving model performance?
The technique used here is called Ensemble learning. It involves combining multiple models.
Bagging: Averages predictions of multiple models trained on different subsets -> Reduces variance.
Boosting: Iteratively improves on weak models by training them sequentially, each focusing on the misclassified examples and correcting the previous outcome -> Reduces bias.
Stacking: Uses a meta-model to learn how to best combine predictions from multiple base models.
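
The effect described in Q3 can be sketched with plain majority (hard) voting. This is a minimal illustration, not bagging, boosting or stacking themselves; the labels and the three prediction lists are made up so that each model errs on different samples.

```python
from collections import Counter

truth   = [1, 0, 1, 1, 0, 1]
model_a = [0, 1, 1, 1, 0, 1]  # wrong on samples 0 and 1
model_b = [1, 0, 0, 0, 0, 1]  # wrong on samples 2 and 3
model_c = [1, 0, 1, 1, 1, 0]  # wrong on samples 4 and 5

def accuracy(pred, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(pred, actual)) / len(actual)

def majority_vote(*predictions):
    """For each sample, return the label predicted by most models."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

ensemble = majority_vote(model_a, model_b, model_c)

for name, pred in [("A", model_a), ("B", model_b), ("C", model_c), ("vote", ensemble)]:
    print(name, round(accuracy(pred, truth), 2))
```

Each model alone is right on 4 of 6 samples, while the vote is right on all 6. The gain depends on the models making different mistakes; if all three erred on the same samples, voting would not help.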

Q4) You are building an ML model and need to encode a categorical feature "size" with values small, medium, large. Would you use One-Hot Encoding or Ordinal Encoding? How would your choice affect the model, and in what situations would one encoding method be preferred over the other?
Ordinal encoding: If there is a meaningful order.
One-Hot encoding: If there is no meaningful order.
In this case we will use ordinal encoding, as there is a meaningful order (Size: Small < Medium < Large).
Choice impact:
* Ordinal may mislead algorithms if the spacing isn't uniform.
* One-Hot avoids bias but increases dimensionality.
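
The two encodings from Q4 can be sketched by hand for the "size" feature. This is a minimal illustration; the function names are hypothetical and the integer codes 0, 1, 2 are one possible choice.

```python
SIZES = ["small", "medium", "large"]  # list order defines the ordinal ranking

def ordinal_encode(value):
    """Map a category to its rank: small=0, medium=1, large=2."""
    return SIZES.index(value)

def one_hot_encode(value):
    """Map a category to a 0/1 vector with one column per category."""
    return [1 if value == size else 0 for size in SIZES]

for size in SIZES:
    print(size, ordinal_encode(size), one_hot_encode(size))
```

The ordinal codes keep the order in a single column but imply equal spacing between the categories; the one-hot vectors make no such assumption at the cost of three columns instead of one.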

Q5) Can you use supervised learning on unlabelled datasets? Why or why not?
No, supervised learning requires labelled data.
Instead we can use:
i) Unsupervised: When the dataset has unlabelled data; discovers patterns without explicit supervision.
ii) Semi-supervised: A small amount of labelled data is trained with large unlabelled data (Supervised + Unsupervised).
iii) Self-supervised: The model learns from unlabelled data by generating its own labels.

Q6) You build a multiple linear regression model, but you notice that 2 predictor variables have a high correlation. How might this affect your model, and what problems could arise?
This leads to multicollinearity, which makes it hard to determine the true effect of each predictor.
* Coefficients become unstable and less interpretable.
* The model may over-rely on one predictor while ignoring others.
* Poor generalization due to high variance.

Q7) What technique can you use to detect and handle multicollinearity to improve the reliability of your regression model?
Detect:
i) Correlation matrix
ii) Variance Inflation Factor (VIF > 5); a VIF of 10 or more indicates multicollinearity.
Handle:
i) Remove correlated features.
ii) Use PCA (Principal Component Analysis) to reduce dimensionality.
iii) Apply regularization (Lasso / Ridge regression).
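
The VIF check from Q7 can be sketched for the simplest case of two predictors, where the R-squared of regressing one predictor on the other equals the squared Pearson correlation, so VIF = 1 / (1 - r^2). This is a minimal illustration with made-up values; `pearson` and `vif_two_predictors` are hypothetical helper names, and with more than two predictors a full regression per feature would be needed instead.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r ** 2)

x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]  # roughly 2 * x1: strongly correlated
x3 = [1, -1, 1, -1, 1, -1]             # only weakly related to x1

print("VIF(x1, x2):", round(vif_two_predictors(x1, x2), 1))
print("VIF(x1, x3):", round(vif_two_predictors(x1, x3), 2))
```

The first pair lands far above the thresholds of 5 and 10 quoted above, while the second stays close to 1, the value for uncorrelated predictors.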

Q8) ... considered a good practice?
i) Improves model ...
ii) Prevents multicollinearity issues in regression.
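
Q9) Given a test point X = (3, 4) and training points A = (2, 2), B = (4, 5), C = (6, 8), D = (…, 5), use Euclidean distance to find the 2 nearest neighbours to X.
A: sqrt((3-2)^2 + (4-2)^2) = sqrt(5) = 2.24
B: sqrt((3-4)^2 + (4-5)^2) = sqrt(2) = 1.41
C: sqrt((3-6)^2 + (4-8)^2) = sqrt(25) = 5
D: sqrt(2^2 + 1^2) = sqrt(5) = 2.24
Therefore the nearest neighbours are: B (1.41), then A or D (2.24).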
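
The distance calculation in Q9 can be checked with a few lines of code. This is a minimal sketch that uses only the points A, B and C, whose coordinates are fully given; D is left out because one of its coordinates is not recoverable here. The variable names are hypothetical.

```python
import math

x = (3, 4)
points = {"A": (2, 2), "B": (4, 5), "C": (6, 8)}

def euclidean(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

distances = {name: euclidean(x, p) for name, p in points.items()}
nearest = sorted(distances, key=distances.get)[:2]  # two smallest distances

for name in sorted(distances, key=distances.get):
    print(name, round(distances[name], 2))
print("2 nearest among A, B, C:", nearest)
```

Among these three points the order is B, then A, then C, which agrees with the hand calculation.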


Q10) Suppose you have 100 training samples but 10 times as many features. What potential issues could this cause, and how would you address them?
Issues:
i) Overfitting
ii) Poor generalization
iii) Sparse data
Solutions:
i) Feature selection (Filter / Wrapper / Embedded methods)
ii) Dimensionality reduction (PCA)
iii) Regularization (Lasso / Ridge)
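
Of the solutions listed for Q10, a filter-style feature selection is the easiest to sketch: score every feature against the target and keep the best few. This is a minimal illustration assuming absolute Pearson correlation as the score; the feature names, values and helpers (`pearson`, `select_top_k`) are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def select_top_k(columns, target, k):
    """Filter method: keep the k features with the largest |correlation| to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

columns = {
    "f1": [1, 2, 3, 4, 5, 6],        # perfectly aligned with the target
    "f2": [1, -1, 1, -1, 1, -1],     # mostly unrelated to the target
    "f3": [6, 5, 4, 3, 2, 1.5],      # strongly, but negatively, related
}
target = [2, 4, 6, 8, 10, 12]

print(select_top_k(columns, target, k=2))
```

A filter like this looks at one feature at a time, so it is cheap but can miss features that only matter in combination; wrapper and embedded methods address that at a higher cost.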

Q11) If 2 variables have a high covariance, does it always mean they are strongly correlated? Why or why not? How does covariance differ from correlation in terms of interpretation and scale?
High covariance doesn't imply strong correlation.
Covariance depends on the scale of the variables.
Correlation normalizes covariance, making it independent of scale.

Aspect          Covariance                                   Correlation
Definition      Measures joint variability of 2 variables    Measures strength and direction of linear relationship
Scale           Depends on units of the variables            Standardized (always between -1 and 1)
Interpretation  Magnitude hard to interpret                  Clear interpretation
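
The scale dependence discussed in Q11 can be sketched by measuring the same quantity in two units. This is a minimal illustration with made-up height and weight values; it uses the sample covariance (dividing by n - 1), and the helper names are hypothetical.

```python
import math

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

height_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weight_kg = [50, 58, 63, 72, 80]
height_cm = [h * 100 for h in height_m]  # same heights, different unit

print("cov (m):  ", covariance(height_m, weight_kg))
print("cov (cm): ", covariance(height_cm, weight_kg))
print("corr (m): ", correlation(height_m, weight_kg))
print("corr (cm):", correlation(height_cm, weight_kg))
```

Switching from metres to centimetres multiplies the covariance by 100 while the correlation stays the same, which is why a large covariance on its own says little about the strength of the relationship.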

Q12) If you apply stemming and lemmatization to the word "running", what could be the possible outputs? How do these techniques differ in their approach to word normalization, and when would you prefer one over the other in an NLP task?
Stemming: running -> runn (removes the suffix)
Lemmatization: running -> run (correct base word)

Aspect             Stemming                              Lemmatization
Approach           Rule-based, chops off word endings    Uses vocabulary and morphological analysis
Accuracy           Less                                  More
Speed              Faster                                Slower
Context awareness  No                                    Yes

Stemming when:
i) Speed is critical.
ii) Inaccuracies are acceptable.
Lemmatization when:
i) Word forms matter.
ii) Working with small datasets.
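
The contrast in Q12 can be sketched with two toy functions. This is a minimal illustration, not a real stemmer or lemmatizer: `naive_stem` only chops a fixed list of endings, and `LEMMAS` is a hypothetical three-word dictionary standing in for a real vocabulary. Established stemmers such as the Porter stemmer apply extra clean-up rules and return "run" for "running".

```python
def naive_stem(word):
    """Toy rule-based stemmer: chop off a known ending and do nothing else."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical mini-dictionary standing in for a real vocabulary lookup.
LEMMAS = {"running": "run", "ran": "run", "better": "good"}

def lemmatize(word):
    """Toy lemmatizer: look the word up, fall back to the word itself."""
    return LEMMAS.get(word, word)

for word in ["running", "ran", "better"]:
    print(word, "stem:", naive_stem(word), "lemma:", lemmatize(word))
```

The stemmer is fast and needs no vocabulary but can return non-words such as "runn" and cannot relate "ran" to "run"; the lookup handles both, at the cost of needing the dictionary.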

Q13) A doctor tests a patient for a disease. If the test incorrectly indicates the presence of the disease when the patient is actually healthy, what type of error is this? Conversely, if the test fails to detect the disease when the patient actually has it, what type of error is this? How do Type I and Type II errors impact real-world decision-making, and in which situations would minimizing one be more critical than the other?
Type I error (False Positive):
The test incorrectly says the patient has the disease when they are actually healthy.
Type II error (False Negative):
The test incorrectly says the patient is healthy when they actually have the disease.
Impact on real-world decision making:
I: Unnecessary treatments.
II: Delayed intervention leading to severe outcomes, risk to public safety.
Minimizing Type I is more critical when:
Avoiding wrongful detections and incorrect alarms.
Minimizing Type II is more critical when:
Avoiding missed detection of a real threat.
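
The two error types in Q13 can be counted directly from test results. This is a minimal sketch with made-up outcomes, where 1 means "has the disease" and 0 means "healthy"; the function and key names are hypothetical.

```python
def error_counts(actual, predicted):
    """Count Type I (false positive) and Type II (false negative) errors."""
    false_pos = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return {"type_1_false_positive": false_pos, "type_2_false_negative": false_neg}

actual    = [0, 0, 1, 1, 0, 1]  # true condition of six patients
predicted = [1, 0, 0, 1, 0, 1]  # what the test reported

print(error_counts(actual, predicted))
```

Here the first patient is a Type I error (healthy but flagged) and the third is a Type II error (ill but missed). Which count matters more depends on the cost of each mistake, as discussed above.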

Q14) Imagine you are building a credit risk prediction model, and one of your input features is "Loan approval status", which is determined after the model's prediction. The model achieves extremely high accuracy. What issue does this indicate, and what steps can you take to prevent such problems when building machine learning models?
The issue here is Data Leakage.
i) Here the model wrongly learns from future information.
ii) The model may cheat by relying on this future prediction instead of the genuine picture.
Solution:
i) Remove the leaky feature.
ii) Use time-based splits.
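
The two fixes from Q14 can be sketched on a handful of records. This is a minimal illustration; every field name other than the loan approval status, all values, and the cutoff date are made up, and the helpers (`drop_leaky`, `time_based_split`) are hypothetical.

```python
# Hypothetical loan records; dates are ISO strings, so string order is date order.
records = [
    {"date": "2023-01-15", "income": 42, "loan_approval_status": 1, "defaulted": 0},
    {"date": "2023-03-02", "income": 18, "loan_approval_status": 0, "defaulted": 1},
    {"date": "2023-06-20", "income": 35, "loan_approval_status": 1, "defaulted": 0},
    {"date": "2023-09-05", "income": 22, "loan_approval_status": 0, "defaulted": 1},
]

# Features that are only known after the prediction would have been made.
LEAKY_FEATURES = {"loan_approval_status"}

def drop_leaky(rows):
    """Return copies of the rows without the leaky columns."""
    return [{k: v for k, v in row.items() if k not in LEAKY_FEATURES} for row in rows]

def time_based_split(rows, cutoff):
    """Train on rows before the cutoff date, test on rows from the cutoff onward."""
    ordered = sorted(rows, key=lambda row: row["date"])
    train = [row for row in ordered if row["date"] < cutoff]
    test = [row for row in ordered if row["date"] >= cutoff]
    return train, test

clean = drop_leaky(records)
train, test = time_based_split(clean, cutoff="2023-06-01")
print(len(train), "training rows,", len(test), "test rows")
```

Splitting by time means the model is always evaluated on records that come after everything it was trained on, which mirrors how it would be used in practice.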

Q15) Explain the candidate elimination problem and the Find-S algorithm.
Candidate elimination:
i) It is an ML method used in inductive learning.
ii) The goal is to narrow down the version space.
iii) The problem is that finding all consistent hypotheses from examples is computationally challenging.
Find-S algorithm:
It is a simplified version of the above that only tracks the most specific hypothesis (S) instead of maintaining 2 sets:
i) General hypothesis G.
ii) Specific hypothesis S.
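
The Find-S idea from Q15 can be sketched in a few lines. This is a minimal illustration, assuming attribute-value examples with a boolean label and using "?" to mean "any value is acceptable"; the weather-style examples are made up.

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples.

    Each example is (attributes, label). Negative examples are ignored,
    which is the defining simplification of Find-S.
    """
    hypothesis = None
    for attributes, label in examples:
        if not label:
            continue  # Find-S never looks at negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # start from the first positive example
        else:
            # Generalize only where the hypothesis disagrees with the example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

examples = [
    (("sunny", "warm", "normal"), True),
    (("sunny", "warm", "high"), True),
    (("rainy", "cold", "high"), False),
    (("sunny", "warm", "high"), True),
]

print(find_s(examples))  # ['sunny', 'warm', '?']
```

Candidate elimination would additionally use the negative example to shrink the general boundary G; Find-S skips that step and keeps only S.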
