Analysis of Variance

Statics

Uploaded by

Subhan Tariq
Copyright
© All Rights Reserved
Chapter 12: Analysis of Variance

Learning Objectives

When you have completed this chapter, you will be able to:

LO1 List the characteristics of the F distribution.
LO2 Conduct a test of hypothesis to determine whether the variances of two populations are equal.
LO3 Discuss the concept of analysis of variance.
LO4 Conduct a test of hypothesis among three or more treatment means.
LO5 Organize data into a one-way ANOVA table.
LO6 Develop confidence intervals for the difference in treatment means.

A new computer is sold with innovative flash memory. When compared to computers with conventional memory, the new machine is clearly faster, but initial tests indicate there is more variation in processing time. A sample of 16 computer runs showed that the standard deviation of the processing time was 22 (hundredths of a second) for the new machine and 12 (hundredths of a second) for the current machine. At the .05 significance level, can we conclude that there is more variation in the processing time of the new machine? (Exercise 16, LO2.)

Introduction

In this chapter we continue our discussion of hypothesis testing. Recall that in Chapters 10 and 11 we examined the general theory of hypothesis testing. We described the case where a sample was selected from the population. We used the z distribution (the standard normal distribution) or the t distribution to determine whether it was reasonable to conclude that the population mean was equal to a specified value. We tested whether two population means are the same. We also conducted both one- and two-sample tests for population proportions, using the standard normal distribution as the distribution of the test statistic. In this chapter we expand our idea of hypothesis tests. We describe a test for variances and then a test that simultaneously compares several means to determine if they came from equal populations.

The F Distribution

The probability distribution used in this chapter is the F distribution. It is used to test whether two populations have equal variances and to compare several population means simultaneously. The simultaneous comparison of several population means is called analysis of variance (ANOVA). In both of these situations, the populations must follow a normal distribution, and the data must be at least interval-scale.

What are the characteristics of the F distribution?
1. There is a family of F distributions. A particular member of the family is determined by the degrees of freedom in the numerator and the degrees of freedom in the denominator. The shape of the distribution is illustrated by the following graph. There is one F distribution for the combination of 29 degrees of freedom in the numerator (df) and 28 degrees of freedom in the denominator. There is another F distribution for 19 degrees of freedom in the numerator and 6 degrees of freedom in the denominator. The final distribution shown has 6 degrees of freedom in the numerator and 6 degrees of freedom in the denominator. We will describe the concept of degrees of freedom later in the chapter. Note that the shapes of the distributions change as the degrees of freedom change.

[Chart: three F distributions, df = (29, 28), df = (19, 6), and df = (6, 6); vertical axis: relative frequency; horizontal axis: F]

2. The F distribution is continuous. This means that it can assume an infinite number of values between zero and positive infinity.

3. The F distribution cannot be negative. The smallest value F can assume is 0.

4. It is positively skewed. The long tail of the distribution is to the right. As the number of degrees of freedom increases in both the numerator and the denominator, the distribution approaches a normal distribution.

5. It is asymptotic. As the values of X increase, the F distribution approaches the X-axis but never touches it. This is similar to the behavior of the normal probability distribution, described in Chapter 7.

Comparing Two Population Variances

The first application of the F distribution that we describe occurs when we test the hypothesis that the variance of one normal population equals the variance of another normal population.
The following examples will show the use of the test:

• Two Barth shearing machines are set to produce steel bars of the same length. The bars, therefore, should have the same mean length. We want to ensure that in addition to having the same mean length they also have similar variation.
• The mean rate of return on two types of common stock may be the same, but there may be more variation in the rate of return in one than the other. A sample of 10 technology and 10 utility stocks shows the same mean rate of return, but there is likely more variation in the technology stocks.
• A study by the marketing department for a large newspaper found that men and women spent about the same amount of time per day surfing the Net. However, the same report indicated there was nearly twice as much variation in time spent per day among the men than the women.

The F distribution is also used to test assumptions for some statistical tests. Recall that in the previous chapter we used the t distribution to investigate whether the means of two independent populations differed. To conduct that test, we assume that the variances of two normal populations are the same. See the list of assumptions on page 333. The F distribution is used to test whether the variances of two normal populations are equal.

Regardless of whether we want to determine whether one population has more variation than another population or validate an assumption for a statistical test, we first state the null hypothesis. The null hypothesis is that the variance of one normal population, $\sigma_1^2$, equals the variance of the other normal population, $\sigma_2^2$. The alternate hypothesis could be that the variances differ. In this instance the null hypothesis and the alternate hypothesis are:

$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$

To conduct the test, we select a random sample of $n_1$ observations from one population, and a random sample of $n_2$ observations from the second population. The test statistic is defined as follows.

$$F = \frac{s_1^2}{s_2^2} \tag{12-1}$$

The terms $s_1^2$ and $s_2^2$ are the respective sample variances. If the null hypothesis is true, the test statistic follows the F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom. In order to reduce the size of the table of critical values, the larger sample variance is placed in the numerator; hence, the tabled F ratio is always larger than 1.00. Thus, the right-tail critical value is the only one required. The critical value of F for a two-tailed test is found by dividing the significance level in half ($\alpha/2$) and then referring to the appropriate degrees of freedom in Appendix B.4.

Example

Lammers Limos offers limousine service from the city hall in Toledo, Ohio, to Metro Airport in Detroit. Sean Lammers, president of the company, is considering two routes. One is via U.S. 25 and the other via I-75. He wants to study the time it takes to drive to the airport using each route and then compare the results. He collected the following sample data, which is reported in minutes. Using the .10 significance level, is there a difference in the variation in the driving times for the two routes?

[Table: sample driving times in minutes for U.S. Route 25 (7 observations) and Interstate 75 (8 observations); the individual values are not legible]

Solution

The mean driving times along the two routes are nearly the same. The mean time is 58.29 minutes for the U.S. 25 route and 59.0 minutes along the I-75 route. However, in evaluating travel times, Mr. Lammers is also concerned about the variation in the travel times. The first step is to compute the two sample variances. We'll use formula (3-2) to compute the sample standard deviations. To obtain the sample variances, we square the standard deviations.

U.S. Route 25

$$\bar{X} = \frac{\sum X}{n} = \frac{408}{7} = 58.29 \qquad s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{485.43}{7 - 1}} = 8.9947$$

Interstate 75

$$\bar{X} = \frac{\sum X}{n} = \frac{472}{8} = 59.00 \qquad s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{134}{8 - 1}} = 4.3753$$

There is more variation, as measured by the standard deviation, in the U.S. 25 route than in the I-75 route. This is consistent with his knowledge of the two routes; the U.S. 25 route contains more stoplights, whereas I-75 is a limited-access interstate highway. However, the I-75 route is several miles longer. It is important that the service offered be both timely and consistent, so he decides to conduct a statistical test to determine whether there really is a difference in the variation of the two routes.

The usual five-step hypothesis-testing procedure will be employed.

Step 1: We begin by stating the null hypothesis and the alternate hypothesis. The test is two-tailed because we are looking for a difference in the variation of the two routes. We are not trying to show that one route has more variation than the other.

$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$

Step 2: We selected the .10 significance level.

Step 3: The appropriate test statistic follows the F distribution.

Step 4: The critical value is obtained from Appendix B.4, a portion of which is reproduced as Table 12-1. Because we are conducting a two-tailed test, the tabled significance level is .05, found by $\alpha/2 = .10/2 = .05$. There are $n_1 - 1 = 7 - 1 = 6$ degrees of freedom in the numerator, and $n_2 - 1 = 8 - 1 = 7$ degrees of freedom in the denominator. To find the critical value, move horizontally across the top portion of the F table (Table 12-1 or Appendix B.4) for the .05 significance level to 6 degrees of freedom in the numerator. Then move down that column to the critical value opposite 7 degrees of freedom in the denominator. The critical value is 3.87. Thus, the decision rule is: Reject the null hypothesis if the ratio of the sample variances exceeds 3.87.

TABLE 12-1 Critical Values of the F Distribution, α = .05

Degrees of Freedom     Degrees of Freedom for Numerator
for Denominator           5       6       7       8
       1                230     234     237     239
       2               19.3    19.3    19.4    19.4
       3               9.01    8.94    8.89    8.85
       4               6.26    6.16    6.09    6.04
       5               5.05    4.95    4.88    4.82
       6               4.39    4.28    4.21    4.15
       7               3.97    3.87    3.79    3.73
       8               3.69    3.58    3.50    3.44
       9               3.48    3.37    3.29    3.23
      10               3.33    3.22    3.14    3.07

Step 5: The final step is to take the ratio of the two sample variances, determine the value of the test statistic, and make a decision regarding the null hypothesis. Note that formula (12-1) refers to the sample variances but we calculated the sample standard deviations. We need to square the standard deviations to determine the variances.

$$F = \frac{s_1^2}{s_2^2} = \frac{(8.9947)^2}{(4.3753)^2} = 4.23$$

The decision is to reject the null hypothesis, because the computed F value (4.23) is larger than the critical value (3.87). We conclude that there is a difference in the variation of the travel times along the two routes.
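The decision in Step 5 can be re-checked with a short script. This is only a sketch of the arithmetic in formula (12-1): the sample sizes and standard deviations are the ones quoted in the solution, the critical value 3.87 is copied from Table 12-1 rather than computed, and the names `variance_ratio`, `US25`, `I75`, and `F_CRITICAL` are illustrative choices, not part of the text.

```python
# Sample summaries from the Lammers Limos example (standard deviations in minutes).
US25 = {"n": 7, "sd": 8.9947}
I75 = {"n": 8, "sd": 4.3753}

# Critical value read from Table 12-1 (alpha/2 = .05, 6 and 7 degrees of freedom).
F_CRITICAL = 3.87


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    larger, smaller = sorted(
        (sample_a, sample_b), key=lambda s: s["sd"], reverse=True
    )
    f_value = larger["sd"] ** 2 / smaller["sd"] ** 2
    return f_value, larger["n"] - 1, smaller["n"] - 1


f_value, df_num, df_den = variance_ratio(US25, I75)
reject_h0 = f_value > F_CRITICAL
print(f"F = {f_value:.2f} with ({df_num}, {df_den}) df; reject H0: {reject_h0}")
```

Sorting by standard deviation before dividing mirrors the convention of placing the larger sample variance in the numerator, so the result does not depend on the order in which the two samples are passed.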
We conclude is a difference in the variation of the travel times along the two rus ad 360 LOB Discuss the concept of analysis of variance, Using the taistribution leads toa buildup of Type l erro, Chapter 12 ven 8, The following hypotheses are 0 Hzet = of mana the frst population resuted in i tions from - A random samele of eght SbstcT observations from the ‘Second population reat deviation of 10, A andor atthe .02 significance level, Is there a diference aoe ina standard deviation of 7. ation oftho two populations? 4, The folowing hypotheses are oi Hy: 04 = 03 Hyot > oF F 1m the first population resulted in a observations from the a iene gb et en th Sfoceg standard deviation of 7. At the .01 significance level, is there me varaten show ie er bopeaten ed a study ofthe iPod listening habits of search, Inc., conducted a study I ee So paecseieioy fire for. rie wom 36 Tv uten por deyibe- otis dloviton of Sampo ofthe 10 men cluded was 10 minutes per day. The ean listening time iy 42 women studied was also 85 minutes, but the Standard dato tthe sang ies. At the 10 sileance level, can wo conlise tat thesia det the vation in he isting times for ren eat Cel Secures epertd thatthe mean rate of rt on a sap g « Sep a aan mae tre eee freuen on a sample fly stocks was 10.9 percent wih sandal aon 35 percent Ath. sigieace level, can we concie hat there ma nt the oil stocks? ANOVA Assumptions Areiter use of the F distribution is the analysis of variance (ANOVA) technique in {ich we compare three or more population means to determine whether they could be equal. To use ANOVA, we assume the following: The populations follow the normal distribution, Lo the Populations have equal standard deviations (a) 7 The populations are independent When these conditions are met, F i Why do we need to study AN ences in population means discus: is used as the distribution of the test statistic. ‘OVA? Why can't we just use the test of difer- sed in the previous chapter? 
We could compare compare the four training methods. Using the ¢ distribution ¢ pws: A versus B, A versus C, A versus D, B versus Cr B versus D, and C versus D. it we set the Significance level at .05, the probabilty Of a correct statistical decision ie 95, found by 1 — .05. Because we conduct sit ‘Separate (independent) tests the Probability that we do not make an incorrect decision 361 due to sampling error in any of the six inde, dent I cor = G fee minastis oo {5495)-95(. 05, e510 = 735, To fin sha ett One enor da ie tout om Thus, ihe, Probability of at least one east gdte aac hs al lin a Gatrbulen, the likelinonet of eh i: We conduct six Independent tests using inate need a ater ya, fe nba al pene oe oa that we need a better method te, Conducting six't tees to compare the treatment means simultaneous) © | of .265. It is obvious kt tests. ANOVA will allow us ly and avoid the ‘buildup of Type | # developed for application. ; terms related to that context toner, PPlications i identify the different i nN agriculture, and. many of the IN parti Populations eit Particular th to how a plot of gro 'e term treatment is used to 19 examined. Wreatment refers i Of Stound was treated wena Pentie of example, treatmen towing istration will clarify the ieee, ‘reat of ANOVA. ular type of fertilizer. The fol- ‘ment and demonstrate an application Joyce Kuhiman manay FFE fealonal financial center, She wishes to compare the pro- ductivity, as. measured by the. Number of customers, ees. Four days are randomly select employee is recorded, Served, among three employ- 'ed and the number of customers served by each The results are: ———_, Wotfe White Korosa —————White____Koros 55 6 a7 54 8 51 | 59 7 ifferent CHART 12-1 Case Where Treatment Means Are Different ations { 42-2. This would indicate quzySe) trounce moana’ Ths 8 show? 
The ANOVA Test

How does the ANOVA test work? Recall that we want to determine whether the various sample means came from a single population or populations with different means. We actually compare these sample means through their variances. To explain, on page 360 we listed the assumptions required for ANOVA. One of those assumptions was that the standard deviations of the various normal populations had to be the same. We take advantage of this requirement in the ANOVA test. The underlying strategy is to estimate the population variance (standard deviation squared) two ways and then find the ratio of these two estimates. If this ratio is about 1, then logically the two estimates are the same, and we conclude that the population means are the same. If the ratio is quite different from 1, then we conclude that the population means are not the same. The F distribution serves as a referee by indicating when the ratio of the sample variances is too much greater than 1 to have occurred by chance.

Refer to the financial center example in the previous section. The manager wants to determine whether there is a difference in the mean number of customers served. To begin, find the overall mean of the 12 observations. It is 58, found by (55 + 54 + ... + 48)/12. Next, for each of the 12 observations find the difference between the particular value and the overall mean. Each of these differences is squared and these squares summed. This term is called the total variation.

TOTAL VARIATION The sum of the squared differences between each observation and the overall mean.

In our example the total variation is 1,082, found by $(55 - 58)^2 + (54 - 58)^2 + \cdots + (48 - 58)^2$.

Next, break this total variation into two components: that which is due to the treatments and that which is random. To find these two components, determine the mean of each of the treatments. The first source of variation is due to the treatments.
To find these two components, 1 Conduct a test of Pessamong thee population variance, from the following equatior TREATMENT VARIATION treatment mean aa ibs ‘Sur the grag oh differences betwee; ariation di n ti lue t¢ and the overall mean "S,™®2M number er teat™ents is the gach of the Ine wt Ths terms 9g Of customers the Sum ofthe squared (65 + 54 +59 + sq igntments. Th 2. To caleate Wwe teehee ee the squales’< 56)/4. The otha? MEA for Wolk it, We first find the mean of U8 0 the og Means are 70 a 56 ustomers, found by (66 — 58% + (66 — 59 is: respectively, The sum of Peg (48 ~ 59 : : os ~ 58)? + 4(70 ~ 59)? + 4(48 - 58)* th 1 treatment means, it is logical that this are similar, this iene toes alle Would be rer ilar, this term will be a small value, The other sour fee Of variatior enor Somnann nis referred to as the ri fandom component, or the RANDOM VARIATION The sum of the observation and its treatment mean Serenees between each In the example this term is is the sum of the squared and the mean for that particular employee. ‘the eerie Bee oot 7 (65 — 56)? + (64 - S6y +-+-+ (48 - 48)° = 90 We determine the test statistic, which is the ratio of the two estimates of the nreteatment : . Oo” Estimate of the population variance 10) * _, _ based onthe citterences among the sample means eo Estimate of the population variance ( based on the variation within the sample 1 variance is based on the treatments, that It is 992/2. Why did we divide by 2? Recall [see formula (2-9), we divide by the rum sre are three treatments, so We divide Our first estimate of the populatior is, the difference between the means. from Chapter 3, to find a sample variance ber of observations minus one. In tis case the timate of the population variance is 992/2, a 2 2 arenes estimate within the treatments is the random variation divided by me tata number of abservations fess the nti, of weatments. 
That 8 20(12 — 3 the total number © “etimate of the population vases cop, This i actualy 3 oneraization cond aul (11-5, where we pooled the ‘sample variances from two ulations. Populatcrt step is to take the ratio of th 992/2 _ 49.6 F= 99 conclude 1 diferent from 4, W8 Ct Crean nur “There is jese two estimates. that the treatment : : stom Because this ratio is aut at he vosiert means are not the Sar mployees. i sere by th Spl, ch ee samples of different Si205- RET) Chaptor 12 meals and snacks during flight vices, such A group of four carriers (wo ran eb baggage. + runner Marketing Research, ing, to ly airlines have cut Sef hired 81 Recent! for checking ba started charging {7 Confidentiality) Mm evel of satisfaction with a recent fig historical names gers regard erg, poarding, in-ight service bast irvey recel ticketing. D » Baga Suey esr ncued sto SS orth. Twenty ~NO uestons ofa 200856 Fealing, lot communal, 4, jar, of oot, A response excellent of possible answors! orf 23, fair a 2, and poor a 1, ene fesponies were t given a score of 4, 900d 8 1 ingication of the Stein with the fight Givsiad, 80 the total score WaS ATT \a) of satisfaction with the service, The The greater the score, the hig .d passengers from the four aitines, } 100. highest possible score W2S 107. yey selected 2Mhere a difference in the mean Satisfaction leg the .01 significanc Brunner randomly Below is the sample informatis among the four airlines? e level. We will use the five-step hypothesis testing procedure. Step 1: State the null hypothesis and the alternate hypothesis. The nul hypothesis is that the mean scores are the same for the four airlines Hoe hy = By = By = Ba . The alternate hypothesis is that the mean scores are not all the for the four airlines. Bm same H,: The mean scores are not all equal. 
We can also think of the alternate hypothesis as "at least two mean scores are not equal." If the null hypothesis is not rejected, we conclude that there is no difference in the mean scores for the four airlines. If $H_0$ is rejected, we conclude that there is a difference in at least one pair of mean scores, but at this point we do not know which pair or how many pairs differ.

Step 2: Select the level of significance. We selected the .01 significance level.

Step 3: Determine the test statistic. The test statistic follows the F distribution.

Step 4: Formulate the decision rule. To determine the decision rule, we need the critical value. The critical value for the F statistic is found in Appendix B.4. The critical values for the .05 significance level are found on the first page and the .01 significance level on the second page. To use this table we need to know the degrees of freedom in the numerator and the denominator. The degrees of freedom in the numerator equals the number of treatments, designated as k, minus 1. The degrees of freedom in the denominator is the total number of observations, n, minus the number of treatments. For this problem there are four treatments and a total of 22 observations.

Degrees of freedom in the numerator $= k - 1 = 4 - 1 = 3$
Degrees of freedom in the denominator $= n - k = 22 - 4 = 18$

The calculations begin with SS total, the total variation: the sum of the squared differences between each observation and the overall mean. The formula for finding SS total is:

$$SS\ total = \sum (X - \bar{X}_G)^2 \tag{12-2}$$

where:
$X$ is each sample observation.
$\bar{X}_G$ is the overall or grand mean.

Next determine SSE or the sum of the squared errors. This is the sum of the squared differences between each observation and its respective treatment mean. The formula for finding SSE is:

$$SSE = \sum (X - \bar{X}_c)^2 \tag{12-3}$$

where:
$\bar{X}_c$ is the sample mean for treatment c.

The detailed calculations of SS total and SSE for this example follow. To determine the values of SS total and SSE we start by calculating the overall or grand mean. There are 22 observations and the total is 1,664, so the grand mean is 75.64.

$$\bar{X}_G = \frac{1{,}664}{22} = 75.64$$

[Table: satisfaction scores by airline with column totals; the legible column headings are Allegheny, Ozark, and Total, and the grand total is 1,664; the remaining entries are not legible]
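Formulas (12-2) and (12-3) and the degrees-of-freedom rules translate directly into code. The sketch below is one way to organize the calculation for samples of different sizes. Because the airline scores are not legible here, the `demo_groups` data are hypothetical values invented purely for illustration, and the function and key names (`one_way_anova`, `ss_total`, `sse`, `sst`) are assumptions of this sketch rather than notation from the text. Only the final lines use figures from the example: 4 treatments, 22 observations, and a total of 1,664.

```python
def one_way_anova(groups):
    """Compute SS total, SSE, the treatment sum of squares, df, and F.

    groups maps a treatment name to its list of observations; sizes may differ.
    """
    observations = [x for values in groups.values() for x in values]
    n = len(observations)
    k = len(groups)
    grand_mean = sum(observations) / n

    # Formula (12-2): squared distance of every observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in observations)

    # Formula (12-3): squared distance of every observation from its own treatment mean.
    sse = 0.0
    for values in groups.values():
        treatment_mean = sum(values) / len(values)
        sse += sum((x - treatment_mean) ** 2 for x in values)

    sst = ss_total - sse                 # variation due to the treatments
    df_num, df_den = k - 1, n - k
    f_value = (sst / df_num) / (sse / df_den)
    return {"grand_mean": grand_mean, "ss_total": ss_total, "sse": sse,
            "sst": sst, "df": (df_num, df_den), "F": f_value}


# Hypothetical data with unequal sample sizes (NOT the airline scores).
demo_groups = {"T1": [3, 5, 7], "T2": [6, 8, 10, 12], "T3": [1, 2]}
demo = one_way_anova(demo_groups)
print(demo)

# Figures quoted in the airline example: 4 treatments, 22 observations, total 1,664.
airline_df = (4 - 1, 22 - 4)
airline_grand_mean = round(1664 / 22, 2)
print(airline_df, airline_grand_mean)
```

Obtaining the treatment sum of squares by subtraction relies on the identity that the total variation splits into a treatment part and a random part, the same decomposition used in the financial center example.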
