
Goodness-of-Fit Test Explained

This document provides an overview of goodness-of-fit tests for categorical data. It defines goodness-of-fit tests as measuring how well observed data fits an assumed model. The Pearson and deviance test statistics, X2 and G2, are introduced to quantify the goodness of fit. Large values of these statistics indicate a poor fit between the data and model. Their distributions approach a chi-squared distribution as the sample size increases, allowing one to calculate p-values to test if the model fits. Examples are given of applying these tests, including to test if dice rolls follow equal probabilities.

Uploaded by

Nita Ferdiana

STAT 504

Analysis of Discrete Data

2.4 Goodness-of-Fit Test
A goodness-of-fit test, in general, measures how well the observed data correspond to the fitted (assumed) model. We will use this concept throughout the course as a way of checking the model fit. As in linear regression, the goodness-of-fit test in essence compares the observed values to the expected (fitted or predicted) values.

A goodness-of-fit statistic tests the following hypothesis:

$H_0$: the model $M_0$ fits
vs.
$H_A$: the model $M_0$ does not fit (or, some other model $M_A$ fits)

Most often the observed data represent the fit of the saturated model, the most complex model possible with the given data. Thus, most often the alternative hypothesis ($H_A$) will represent the saturated model $M_A$, which fits perfectly because each observation has a separate parameter. Later in the course we will see that $M_A$ could be a model other than the saturated one. Let us now consider the simplest example of the goodness-of-fit test with categorical data.
In the setting for one-way tables, we measure how well an observed variable $X$ corresponds to a $\mathrm{Mult}(n, \pi)$ model for some vector of cell probabilities, $\pi$. We will consider two cases:
1. when the vector $\pi$ is known, and
2. when the vector $\pi$ is unknown.
In other words, we assume that under the null hypothesis the data come from a $\mathrm{Mult}(n, \pi)$ distribution, and we test whether that model fits against the fit of the saturated model. The rationale behind any model fitting is the assumption that a complex mechanism of data generation may be represented by a simpler model. The goodness-of-fit test is applied to corroborate our assumption.
Consider our Dice Example (/stat504/sites/onlinecourses.science.psu.edu.stat504/files/lesson02/dice_example.png) from the Introduction. We want to test the hypothesis that all six sides are equally probable; that is, we compare the observed frequencies to the assumed model $X \sim \mathrm{Mult}(n = 30, \pi_0 = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6))$. You can think of this as simultaneously testing whether the probability in each cell equals a specified value, e.g.

$H_0: (\pi_1, \pi_2, \pi_3, \pi_4, \pi_5, \pi_6) = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)$
vs.
$H_A: (\pi_1, \pi_2, \pi_3, \pi_4, \pi_5, \pi_6) \neq (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)$.
Most software packages already have built-in functions that will do this for you; see the next section for examples (https://onlinecourses.science.psu.edu/stat504/node/61) in SAS and R. Here is a step-by-step procedure to help you understand this test conceptually and see what is going on behind these functions.
Step 1: If the vector $\pi$ is unknown, we need to estimate these unknown parameters and then proceed to Step 2. If the vector $\pi$ is known, proceed directly to Step 2.
Step 2: Calculate the estimated (fitted) cell probabilities $\hat{\pi}_j$'s and the expected cell frequencies $E_j$'s under $H_0$.
Step 3: Calculate the Pearson goodness-of-fit statistic, $X^2$, and/or the deviance statistic, $G^2$, and compare them to the appropriate chi-squared distributions to make a decision.
Step 4: If the decision is borderline or if the null hypothesis is rejected, further investigate which observations may be influential by looking, for example, at residuals (../node/62).

Pearson and deviance test statistics
The Pearson goodness-of-fit statistic is

$$X^2 = \sum_{j=1}^{k} \frac{(X_j - n\hat{\pi}_j)^2}{n\hat{\pi}_j}$$

An easy way to remember it is

$$X^2 = \sum_{j} \frac{(O_j - E_j)^2}{E_j}$$

where $O_j = X_j$ is the observed count in cell $j$, and $E_j = E(X_j) = n\hat{\pi}_j$ is the expected count in cell $j$ under the assumption that the null hypothesis is true, i.e. the assumed model is a good one. Notice that $\hat{\pi}_j$ is the estimated (fitted) cell proportion $\pi_j$ under $H_0$.

The deviance statistic is

$$G^2 = 2 \sum_{j=1}^{k} X_j \log\left(\frac{X_j}{n\hat{\pi}_j}\right)$$

where "log" means natural logarithm. An easy way to remember it is

$$G^2 = 2 \sum_{j} O_j \log\left(\frac{O_j}{E_j}\right)$$
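As a minimal sketch of the two formulas above, the Python below computes both statistics for a known probability vector. Python is used here only for illustration (the course itself works in SAS and R). The function names `pearson_x2` and `deviance_g2` and the die counts are hypothetical and are not taken from the course data.

```python
import math

def pearson_x2(observed, probs):
    """Pearson statistic: sum over cells of (O_j - E_j)^2 / E_j."""
    n = sum(observed)
    expected = [n * p for p in probs]          # E_j = n * pi_j
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def deviance_g2(observed, probs):
    """Deviance statistic: 2 * sum of O_j * log(O_j / E_j), natural log.

    A cell with O_j = 0 contributes 0, the limit of x*log(x) as x -> 0.
    """
    n = sum(observed)
    expected = [n * p for p in probs]
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

# Hypothetical counts for 30 rolls of a die (made up for illustration).
counts = [3, 7, 5, 10, 2, 3]
fair = [1 / 6] * 6

x2 = pearson_x2(counts, fair)
g2 = deviance_g2(counts, fair)
print(f"X2 = {x2:.3f}, G2 = {g2:.3f}, df = {len(counts) - 1}")
```

Skipping empty cells in `deviance_g2` is a design choice of this sketch; it avoids evaluating log(0) while leaving the value of the sum unchanged.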

In some texts, $G^2$ is also called the likelihood-ratio test statistic, for comparing the likelihoods (http://onlinecourses.science.psu.edu/stat504/node/27) ($\ell_0$ and $\ell_1$) of two models, that is, comparing the log-likelihood under $H_0$ (i.e., the log-likelihood of the fitted model, $L_0$) and the log-likelihood under $H_A$ (i.e., the log-likelihood of the larger, less restricted, or saturated model, $L_1$): $G^2 = -2\log(\ell_0/\ell_1) = -2(L_0 - L_1)$. A common mistake in calculating $G^2$ is to leave out the factor of 2 at the front.
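The likelihood-ratio form can be checked numerically. The sketch below, in Python for illustration only, evaluates the multinomial log-likelihood under the null model and under the saturated model and confirms that -2 times their difference matches the cell-by-cell deviance formula. The helper name `multinomial_loglik` and the counts are hypothetical.

```python
import math

def multinomial_loglik(counts, probs):
    """Log-likelihood of a multinomial sample, including the constant term."""
    n = sum(counts)
    const = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Cells with a zero count contribute nothing to the sum.
    return const + sum(x * math.log(p) for x, p in zip(counts, probs) if x > 0)

counts = [3, 7, 5, 10, 2, 3]            # hypothetical die counts, n = 30
n = sum(counts)
pi0 = [1 / 6] * 6                       # null model: fair die
saturated = [x / n for x in counts]     # saturated model: sample proportions

L0 = multinomial_loglik(counts, pi0)
L1 = multinomial_loglik(counts, saturated)

g2_from_loglik = -2.0 * (L0 - L1)
g2_direct = 2.0 * sum(x * math.log(x / (n * p))
                      for x, p in zip(counts, pi0) if x > 0)
print(g2_from_loglik, g2_direct)
```

The multinomial constant is the same in both log-likelihoods, so it cancels in the difference; only the terms involving the cell probabilities matter.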
Note that $X^2$ and $G^2$ are both functions of the observed data $X$ and a vector of probabilities $\pi$. For this reason, we will sometimes write them as $X^2(x, \pi)$ and $G^2(x, \pi)$, respectively; when there is no ambiguity, however, we will simply use $X^2$ and $G^2$. We will be dealing with these statistics throughout the course in the analysis of 2-way and k-way tables, and when assessing the fit of log-linear and logistic regression models.

Testing the Goodness-of-Fit
$X^2$ and $G^2$ both measure how closely the model, in this case $\mathrm{Mult}(n, \pi)$, "fits" the observed data.

If the sample proportions $p_j = X_j/n$ (i.e., the saturated model) are exactly equal to the model's $\pi_j$ for cells $j = 1, 2, \ldots, k$, then $O_j = E_j$ for all $j$, and both $X^2$ and $G^2$ will be zero. That is, the model fits perfectly.

If the sample proportions $p_j$ deviate from the $\hat{\pi}_j$'s computed under $H_0$, then $X^2$ and $G^2$ are both positive. Large values of $X^2$ and $G^2$ mean that the data do not agree well with the assumed/proposed model $M_0$.

How can we judge the sizes of $X^2$ and $G^2$?
The answer is provided by this result:

If $x$ is a realization of $X \sim \mathrm{Mult}(n, \pi)$, then as $n$ becomes large, the sampling distributions of both $X^2(x, \pi)$ and $G^2(x, \pi)$ approach a chi-squared distribution (http://onlinecourses.science.psu.edu/stat504/node/23#chisquared) with df = $k - 1$, where $k$ = number of cells, $\chi^2_{k-1}$.

This means that we can easily test a null hypothesis $H_0: \pi = \pi_0$ against the alternative $H_1: \pi \neq \pi_0$ for some prespecified vector $\pi_0$. An approximate $\alpha$-level test of $H_0$ versus $H_1$ is:

Reject $H_0$ if the computed $X^2(x, \pi_0)$ or $G^2(x, \pi_0)$ exceeds the theoretical value $\chi^2_{k-1}(1 - \alpha)$.

Here, $\chi^2_{k-1}(1 - \alpha)$ denotes the $(1 - \alpha)$th quantile of the $\chi^2_{k-1}$ distribution, the value for which the probability that a $\chi^2_{k-1}$ random variable is less than or equal to it is $1 - \alpha$. The p-value for this test is the area to the right of the computed $X^2$ or $G^2$ under the $\chi^2_{k-1}$ density curve. Below is a simple visual example. Consider a chi-squared distribution with df = 10. Let's assume that a computed test statistic is $X^2 = 21$. For $\alpha = 0.05$, the theoretical value is 18.31.

[Figure: chi-squared distribution with df = 10; R code: /stat504/sites/onlinecourses.science.psu.edu.stat504/files/lesson02/chisqdistributions.R]

Useful functions in SAS and R to remember for computing the p-values from the chi-square distribution are:
In R, p-value = 1 - pchisq(test statistic, df), e.g., 1 - pchisq(21, 10) = 0.021
In SAS, p-value = 1 - probchi(test statistic, df), e.g., 1 - probchi(21, 10) = 0.021
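To see what such a function computes, here is a rough Python equivalent that uses only the standard library. The helper `chi2_sf` is a hypothetical name introduced for this sketch and is not part of R, SAS, or any package. It assumes a positive integer df and builds the upper-tail probability from the recurrence Q(a + 1, x) = Q(a, x) + x^a e^(-x) / Gamma(a + 1) for the regularized upper incomplete gamma function.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability P(chi2_df > stat) for a positive integer df."""
    if stat <= 0:
        return 1.0
    x = stat / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-x)                 # Q(1, x)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(x))      # Q(1/2, x)
    while a < df / 2.0:
        # Step from Q(a, x) to Q(a + 1, x).
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1))
        a += 1.0
    return q

p_value = chi2_sf(21, 10)                # same inputs as the R and SAS examples
print(round(p_value, 3))                 # agrees with the 0.021 quoted above

# The tail area beyond the critical value 18.31 (df = 10) is close to 0.05.
print(round(chi2_sf(18.31, 10), 3))
```

In practice a library routine is preferable; the point of the sketch is only that the p-value is the right-tail area of the chi-squared distribution with k - 1 degrees of freedom.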
You can quickly review the chi-squared distribution in Lesson 0 (https://onlinecourses.science.psu.edu/stat504/node/23), or check out http://www.statsoft.com/textbook/stathome.html and http://www.ruf.rice.edu/~lane/stat_sim/chisq_theor/index.html. The STATSOFT link also has brief reviews of many other statistical concepts and methods.
Here are a few more comments on this test.

When $n$ is large and the model is true, $X^2$ and $G^2$ tend to be approximately equal. For large samples, the results of the $X^2$ and $G^2$ tests will be essentially the same.

An old-fashioned rule of thumb is that the $\chi^2$ approximation for $X^2$ and $G^2$ works well provided that $n$ is large enough to have $E_j = n\pi_j \ge 5$ for every $j$. Nowadays, most agree that we can have $E_j < 5$ for some of the cells (say, 20% of them). Some of the $E_j$'s can be as small as 2, but none of them should fall below 1. If this happens, then the $\chi^2$ approximation isn't appropriate, and the test results are not reliable.
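The rule of thumb can be turned into a quick screening step before trusting the chi-squared approximation. The sketch below is in Python for illustration; `expected_count_check` is a hypothetical helper, and the default thresholds (5, 1, and 20%) simply encode the guideline stated above rather than any formal criterion.

```python
def expected_count_check(n, probs, max_small_fraction=0.20):
    """Screen expected counts E_j = n * pi_j against the rule of thumb."""
    expected = [n * p for p in probs]
    small = sum(1 for e in expected if e < 5)      # cells with E_j < 5
    fraction_small = small / len(expected)
    return {
        "expected": expected,
        "all_at_least_5": small == 0,              # old-fashioned rule
        "fraction_below_5": fraction_small,
        # Relaxed rule: no E_j below 1 and at most ~20% of cells below 5.
        "relaxed_rule_ok": min(expected) >= 1
                           and fraction_small <= max_small_fraction,
    }

# Hypothetical settings, chosen only to exercise each branch.
die = expected_count_check(60, [1 / 6] * 6)
mild = expected_count_check(50, [0.30, 0.30, 0.20, 0.15, 0.05])
sparse = expected_count_check(12, [0.50, 0.30, 0.15, 0.05])

for name, result in [("die", die), ("mild", mild), ("sparse", sparse)]:
    print(name, result["all_at_least_5"], result["relaxed_rule_ok"])
```

A failed check does not decide anything by itself; it only signals that the large-sample p-value should be treated with caution.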

In practice, it's a good idea to compute both $X^2$ and $G^2$ to see if they lead to similar results. If the resulting p-values are close, then we can be fairly confident that the large-sample approximation is working well.

If it is apparent that one or more of the $E_j$'s are too small, we can sometimes get around the problem by collapsing or combining cells until all the $E_j$'s are large enough. But we can also perform a small-sample inference or exact inference. We will see more on this in Lesson 3 (https://onlinecourses.science.psu.edu/stat504/node/89). Please note that small-sample inference can be conservative for discrete distributions, that is, it may give a larger p-value than the true one (e.g., for more details see Agresti (2007), Sec. 1.4.3-1.4.5 and 2.6; Agresti (2013), Sec. 3.5; and, for Bayesian inference, Sec. 3.6).

In most applications, we will reject the null hypothesis $X \sim \mathrm{Mult}(n, \pi)$ for large values of $X^2$ or $G^2$. On rare occasions, however, we may want to reject the null hypothesis for unusually small values of $X^2$ or $G^2$. That is, we may want to define the p-value as $P(\chi^2_{k-1} \le X^2)$ or $P(\chi^2_{k-1} \le G^2)$. Very small values of $X^2$ or $G^2$ suggest that the model fits the data too well, i.e. the data may have been fabricated or altered in some way to fit the model closely. This is how R. A. Fisher figured out that some of Mendel's experimental data must have been fraudulent (e.g., see Agresti (2007), page 327; Agresti (2013), page 19).
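For the "too good to be true" direction, the relevant quantity is the left-tail area. One way to sketch it in Python, using only the standard library, is the series expansion of the regularized lower incomplete gamma function, which evaluates the small left-tail probability directly instead of subtracting a right-tail area from 1. The helper name `chi2_cdf` and the example statistic are hypothetical.

```python
import math

def chi2_cdf(stat, df, max_terms=500):
    """Left-tail probability P(chi2_df <= stat) via the series
    P(a, x) = x^a e^(-x) * sum_{n>=0} x^n / Gamma(a + n + 1),
    with a = df / 2 and x = stat / 2."""
    if stat <= 0:
        return 0.0
    a, x = df / 2.0, stat / 2.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1))   # n = 0 term
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)              # ratio of consecutive series terms
        total += term
        if term < 1e-16 * total:         # further terms no longer matter
            break
    return total

# Hypothetical case: a very small statistic with k = 6 cells (df = 5).
left_tail = chi2_cdf(0.5, 5)
print(f"P(chi2_5 <= 0.5) = {left_tail:.4f}")
```

A left-tail probability this small would indicate agreement with the model that is closer than chance alone would usually produce, which is the pattern described above.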