
CHAPTER 6

Tests of Hypotheses

The objective of most statistical investigations is inference--that is, making
decisions or predictions about a population based on information in a sample. The
knowledge of random variables and their probability distributions enables us to
construct theoretical models of populations. The probability distributions of some
random variables (Z, T, χ², F) will be used in this chapter to do tests of hypotheses
about the population.

6.1 Terminology

To actually make the decision or prediction about a population, we use the
sample data to compute sample statistics (quantities calculated from sample
observations), such as the sample mean or variance. Because sample measurements
are observed values of random variables, the value that we compute for a sample
statistic will vary in a random manner from sample to sample. Since sample statistics
are random variables, they also possess probability distributions that are either discrete
or continuous, as discussed in Chapter 5. These probability distributions, called
sampling distributions because they characterize the distribution of values of the
various statistics over a large number of samples, are used to evaluate the reliability of
inferences made using statistics. In this chapter, we will learn how to use sample
statistics to make inferences about the population. That is, we will estimate or make
decisions about population means and other population values/characteristics based
on one or more samples selected from one or more populations and then use the
sampling distribution of a sample statistic to assess the uncertainty associated
with an inference.

Other important terms to learn:

Statistical hypothesis - an assumption about the value of one or more
population parameters; an assumed value of a population parameter.

Test of a statistical hypothesis - a procedure for determining whether a
hypothesis can be rejected; determining the validity of an assumption; choosing
between two competing hypotheses about the value of a population parameter.

Tests of significance - statistical procedures which lead to decisions.

6.2 Types of Hypotheses

Null vs. Alternative Hypotheses

Null hypothesis (Ho) - a hypothesis of no difference, of independence, or of an
assumed value of a parameter; usually formulated for the purpose of being
rejected. (Null means invalid, void, or amounting to nothing.)

Alternative hypothesis (Ha) - the operational statement of the researcher's
hypothesis; it is the assertion that is accepted if Ho is rejected, opposes the
null hypothesis, and is what we wish to establish as true.

Note: Ho and Ha are mutually exclusive.

Simple vs. Composite Hypotheses

Simple hypothesis - a hypothesis specifying only one value of the population
parameter.

Composite hypothesis - a hypothesis specifying a range of values that the
population parameter may assume.

Examples:
One-sample test:
     Ho: μ = μ1 = 100   (simple)
     Ha: μ ≠ 100        (composite)
     Ha: μ > 100        (composite)
     Ha: μ < 100        (composite)

Two-sample test:
     Ho: μ1 = μ2, or Ho: μ1 - μ2 = 0   (simple)
     Ha: μ1 ≠ μ2                       (composite)
     Ha: μ1 > μ2                       (composite)
     Ha: μ1 < μ2                       (composite)

One-sided vs. Two-sided Tests

One-sided test - a statistical test in which Ha specifies that the population
parameter is entirely above or entirely below the value specified by
Ho; also called a one-tailed test and a directional test.

Two-sided test - a statistical test in which Ha specifies that the
population parameter can lie on either side of the value specified by
Ho; also called a two-tailed test and a nondirectional test.

Examples:

One-sided tests:
     Ho: μ = 100     Ho: μ = 100     Ho: μ1 - μ2 = 0     Ho: μ1 - μ2 = 0
     Ha: μ > 100     Ha: μ < 100     Ha: μ1 - μ2 > 0     Ha: μ1 - μ2 < 0
                                     (Ha: μ1 > μ2)       (Ha: μ1 < μ2)

Two-sided tests:
     Ho: μ = 100     Ho: μ1 = μ2 or Ho: μ1 - μ2 = 0
     Ha: μ ≠ 100     Ha: μ1 ≠ μ2 or Ha: μ1 - μ2 ≠ 0

Steps for selecting Ho and Ha:

1. Specify the hypothesis you wish to support. Remember this will give a
   range of possible values for the parameter being tested and will be
   expressed as an inequality in the alternative hypothesis.
   Example: Ha: μ > 75

2. Define the opposite of Ha.
   Example: μ ≤ 75

3. For Ho, choose the value of the parameter that is nearest in value to those
   specified in Ha.
   Example: μ = 75
   Thus, Ho: μ = 75.

6.3 Decision Problem in Hypothesis Testing

1. We choose between two mutually exclusive propositions (Ho vs. Ha) about
   a population parameter.

2. We use the sample to infer about the population parameter. Thus, we are
   faced with the uncertainty inherent in sampling from a population, with only
   the sample evidence on which to base the choice of accepting Ho.

3. We solve this problem by first assuming that Ho is true. Then, using
   probability theory, we establish the criteria that will be used to decide
   whether there is sufficient evidence to declare Ho false. The sample
   evidence (through a sample statistic) is compared to the criteria, and the
   decision is made whether to accept or reject Ho. (We attempt to gain
   support for the research hypothesis (Ha) by producing evidence to show
   that Ho is false.)

4. We reject Ho on the basis of only a reasonable doubt about its truth. Thus,
   the probability value upon which we base our conclusion that there is
   reason to doubt the truth of Ho is critical. (We attempt to show that the
   sample is inconsistent with Ho: if Ho is true, the sample represents a rare
   event.)

6.4 Conclusions and Consequences for a Test of an Hypothesis

                    Ho is true          Ho is false
Accept Ho           Correct decision    Type II error
Reject Ho           Type I error        Correct decision

P(Type I error) = P(Reject Ho|Ho is true) = α, the level of significance of the test

If P(Type I error) ≤ α, reject Ho, since α is very small.

1 - P(Type I error) = P(Accept Ho|Ho is true) = 1 - α, the confidence level

P(Type II error) = P(Accept Ho|Ho is false) = β; β increases as the actual value of the
     parameter gets closer to the hypothesized value.

1 - P(Type II error) = P(Reject Ho|Ho is false) = 1 - β, the power of the statistical test,
     since it indicates the ability of the test to recognize correctly
     that Ho is false and hence, that Ho should be rejected.

Note:

1. Rather than compute the actual probability of a Type I error, researchers
   conventionally establish the level of significance beforehand, by
   considering the consequences of making a Type I error. The two most
   frequently used levels of significance are 0.05 and 0.01. When α = 0.05,
   the researcher is willing to accept a 5% chance of being wrong when Ho is
   rejected. When α = 0.001 or 0.005, the researcher is very concerned about
   wrongly accepting the alternative hypothesis (Ha). When Ho is rejected at the 0.05
   level of significance, the result is said to be "statistically significant at the
   0.05 or 5% level," and if rejected at the 0.01 level of significance, the result
   is "highly significant at the 0.01 or 1% level."

2. For a fixed sample size n, decreasing α would mean increasing β. The
   only way to decrease both types of error is to increase the sample size n.

3. If we do not reject Ho, we do not conclude that Ho is true, because then we risk the
   possibility of a Type II error without knowing the value of β. Since we will
   not calculate β, we simply conclude that insufficient evidence exists to
   refute the null hypothesis. Withhold judgment and seek a larger sample
   size to lead you closer to a decision. Or, estimate the parameter by using
   a confidence interval. This will give an interval estimate of its true value
   and give you a measure of the reliability of your inference.

6.5 Concept of a Test Statistic and Critical Region

Test statistic - a random variable used to determine how close a specific sample
result falls to one of the hypotheses being tested.

Examples: the Z, T, χ², and F statistics

Values of the test statistic can be classified in two sets:

1. Critical region - or rejection region, the set of values of the test statistic that
will lead to rejection of Ho.

2. Acceptance region - the set of values of the test statistic that will lead to
nonrejection of Ho.

Critical value of the test statistic - that value which separates the critical region
from the acceptance region.

Example: Suppose the test statistic is Z and the criterion used to decide
whether to reject or not reject Ho is:

     If P(Z ≥ z) ≤ α, reject Ho.

The critical region is then the set of values in the interval [zα, ∞).

Thus:
     If the calculated z ≥ zα, reject Ho. (If z > zα, then P(Z ≥ z) < α;
     if z = zα, then P(Z ≥ z) = α.)

Observed significance level (or p-value) - for a specific statistical test, the
probability (assuming Ho is true) of observing a value of the test statistic that is at least
as contradictory to the null hypothesis, and supportive of the alternative hypothesis, as
the one computed from the sample data; the probability of observing a value as large
as the one computed from the sample, assuming that the null hypothesis is true.

Example: Suppose the test statistic is Z and the computed value of Z from the
sample observations is 3.67. The p-value for this test is determined as follows:

     P(Z > 3.67 when Ho is true)
          = 1 - P(Z < 3.67 when Ho is true)
          = 1 - 0.9999
          = 0.0001

This means that it is not likely to observe a z-value as large as 3.67 if Ho is
true. This leads us to reject the null hypothesis at the 0.0001 level of
significance.
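For readers who want a quick numerical check of such a p-value, the short Python sketch below uses SciPy's standard normal distribution; SciPy itself is an assumption here and is not part of this chapter.

     from scipy.stats import norm

     z = 3.67                  # computed value of the test statistic from the example
     p_value = norm.sf(z)      # sf(z) = P(Z > z) = 1 - norm.cdf(z)
     print(round(p_value, 4))  # about 0.0001, matching the table value above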

6.6 Tests of Hypotheses on the Population Mean

A.1 Test of hypothesis about μ when σ² is known

1. Ho: μ = μo, where μo is the hypothesized value of the mean

   a. Ha: μ > μo
   b. Ha: μ < μo
   c. Ha: μ ≠ μo

2. α = 0.05 or 0.01, etc.; P(Reject Ho|Ho is true) ≤ α.

3. Test statistic: Z

   When X̄ is normal or approximately normal with mean μ_X̄ = μ and
   standard deviation σ_X̄ = σ/√n, then under Ho (μ = μo),

        z = (x̄ - μo) / (σ/√n)

   is a value of Z ~ N(0, 1).

   We calculate x̄ from the sample and standardize it to a z
   value to decide whether x̄ is close enough to μo to accept Ho. (The
   z-score expresses the distance between x̄ and μo in units of the standard
   deviation of X̄, σ_X̄.)
4. Critical Region and Decision on Ho

   a. Reject Ho if z ≥ zα, where P(Z ≥ zα) = α.

      The probability of observing such a z when Ho is true is very small, so we
      reject Ho since Ho is not likely to be true.

      Note:
      P(Z < zα) = 1 - α

      At α = 0.01:
      P(Z < z0.01) = 1 - 0.01 = 0.99
      Since P(Z < 2.33) = 0.99, z0.01 = 2.33.

      At α = 0.05:
      P(Z < z0.05) = 1 - 0.05 = 0.95
      Since P(Z < 1.645) = 0.95, z0.05 = 1.645.

   b. Reject Ho if z ≤ -zα, where P(Z ≤ -zα) = α. (Equivalently, for this
      lower-tailed test, reject Ho if |z| ≥ zα with z negative.)

      Note:
      -zα is just the negative of zα:
      -z0.01 = -2.33
      -z0.05 = -1.645

   c. Reject Ho if z ≥ zα/2 or z ≤ -zα/2 (or reject Ho if |z| ≥ zα/2).

      Note:
      At α = 0.01 (α/2 = 0.005):
      P(Z > z0.005) = 0.005
      P(Z < z0.005) = 0.995
      z0.005 = 2.575

      At α = 0.05 (α/2 = 0.025):
      P(Z > z0.025) = 0.025
      P(Z < z0.025) = 0.975
      z0.025 = 1.96

5. Conclusion:

   Reject Ho at the α level of significance.
        or
   Do not reject Ho. There is insufficient evidence to reject Ho;
   the data do not provide sufficient evidence to support the
   research hypothesis.

Example:

   A statistics professor would like to assess the impact of spending twice as
much time as before on probability theory, taking some time away from that
devoted to other core statistical material. Should he find that this results in a
score improvement on a certain standardized test, he will adopt the new policy in
the future.

   This policy was tried on a class of 25 students, and at the end of the
semester the students were given the standardized test. Over the years, the
professor had established that the scores on the test are normally distributed
with a mean of 75 points and a standard deviation of 15 points. The sample
results show that x̄ = 86 points. Should the new policy be adopted? Use α =
0.01.

Solution:
1. Ho: μ = 75 (Policy does not work.)
   Ha: μ > 75 (Policy works.)

2. α = 0.01, P(Reject Ho|Ho is true) ≤ 0.01

3. Test statistic: Z

   X = score on standardized test of a student

   X ~ Normal(75, 15²)  ⟹  X̄ ~ Normal(75, 15²/25)

   If Ho is true:

        z = (86 - 75) / (15/√25) = 11/3 = 3.67

4. Reject Ho if z ≥ z0.01 = 2.33  ⟹  reject Ho since 3.67 > 2.33.

   Note: P(Z > 3.67 when Ho is true) = p-value
         P(Z < 3.67) = 0.9999
         P(Z > 3.67) = 0.0001 = p-value, the probability of observing a value as
         large as 3.67 when Ho is true

   Since p-value < α  ⟹  reject Ho.

5. Conclusion: The new policy should be adopted at the 1% level of
   significance. (Using the p-value, the new policy should be adopted at
   the 0.01% level of significance.)
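As a rough cross-check of the computations in this example (x̄ = 86, μo = 75, σ = 15, n = 25, α = 0.01), a minimal Python sketch is given below; it assumes SciPy is available and simply reproduces the z value, the critical value, and the p-value.

     from math import sqrt
     from scipy.stats import norm

     x_bar, mu0, sigma, n, alpha = 86, 75, 15, 25, 0.01

     z = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic under Ho, about 3.67
     z_crit = norm.ppf(1 - alpha)            # z_0.01, about 2.33
     p_value = norm.sf(z)                    # P(Z > z) under Ho, about 0.0001

     print(z, z_crit, p_value)
     # Reject Ho since z >= z_crit (equivalently, p_value < alpha).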

A.2 Test of hypothesis about μ when σ² is unknown (σ is estimated by s)

A.2.1 If n is large (n ≥ 30), the test statistic is Z, regardless of the distribution of X.

        z = (x̄ - μo) / (s/√n),  as in A.1

A.2.2 If n is small (n < 30), the test statistic is T, if the distribution of X is not too
different from normal.

        t = (x̄ - μo) / (s/√n),  with n - 1 degrees of freedom

Critical Region:

   a. Ha: μ > μo
      Reject Ho if t ≥ tα,n-1

   b. Ha: μ < μo
      Reject Ho if t ≤ -tα,n-1

   c. Ha: μ ≠ μo
      Reject Ho if t ≥ tα/2,n-1 or t ≤ -tα/2,n-1; or
      reject Ho if |t| ≥ tα/2,n-1

Example:

   Beta Company is manufacturing steel wire with an average tensile strength
of 50 kilos. The laboratory tests 16 pieces and finds that the mean is 47 kilos
and the standard deviation is 15 kilos. Are the results in accordance with the
hypothesis that the population mean is 50 kilos? Use α = 0.01.

Solution: Assume tensile strength is normally distributed, since n < 30.

1. Ho: μ = 50 kilos (Sample came from a population with a mean of 50 kgs.)
   Ha: μ < 50 kilos (Sample came from a population with a mean of less than 50
   kgs.)

2. P(Reject Ho|Ho is true) ≤ 0.01

3. Test statistic: T (n < 30)

   X = tensile strength of steel wire
   X ~ Normal(50, σ²)  ⟹  under Ho, X̄ ~ Normal(50, σ²/16)

        t = (x̄ - μo) / (s/√n) = (47 - 50) / (15/4) = -0.8

4. Reject Ho if t ≤ -t0.01,15 = -2.602. Since -0.8 is not less than -2.602,
   the decision on Ho is: do not reject Ho.

5. Conclusion: There is no sufficient evidence to indicate that the results are
   not in accordance with the hypothesis that the population mean is 50 kilos.

   Note: p-value = P(T < -0.8) when μ = 50, v = 15
                 = P(T > 0.8), v = 15
                 > 0.10, since P(T > 1.341) = 0.10
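A similar sketch for the Beta Company example (x̄ = 47, μo = 50, s = 15, n = 16, lower-tailed test at α = 0.01), again assuming SciPy:

     from math import sqrt
     from scipy.stats import t

     x_bar, mu0, s, n, alpha = 47, 50, 15, 16, 0.01
     df = n - 1

     t_stat = (x_bar - mu0) / (s / sqrt(n))   # -0.8
     t_crit = -t.ppf(1 - alpha, df)           # -t_{0.01,15} = -2.602
     p_value = t.cdf(t_stat, df)              # P(T < -0.8), about 0.22

     print(t_stat, t_crit, p_value)
     # Do not reject Ho since t_stat is not <= t_crit (and p_value > alpha).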

6.7 Tests of Hypotheses on the Variances of Normal Populations

A.1 Test of hypotheses about σ²

1. Ho: σ² = σo², where σo² is the hypothesized value of the variance

   a. Ha: σ² > σo²
   b. Ha: σ² < σo²
   c. Ha: σ² ≠ σo²

2. α = 0.05, 0.01, etc.

3. Test statistic: χ²

        χ² = (n - 1)s² / σo²,  v = n - 1,

   where s² is the variance of a random sample taken from a
   normal population having a variance of σ².

4. Critical region and decision on Ho:

   a. Reject Ho if χ² ≥ χ²α,v;  P(χ² ≥ χ²α,v) = α

   b. Reject Ho if χ² ≤ χ²1-α,v;  P(χ² ≤ χ²1-α,v) = α

   c. Reject Ho if χ² ≤ χ²1-α/2,v or χ² ≥ χ²α/2,v

5. Conclusion at the (α × 100)% level of significance.

Example:

   A telephone company is continually studying the length of phone calls as
well as the variability in lengths. Suppose that the national population variance is
4 minutes squared. The telephone company wants to test whether a certain
community's calls differ in variability from the national value. The length of calls is
assumed to be normally distributed. What would be your decision if a random
sample of 25 calls resulted in a variance of 2.5? Use α = 0.05.

Solution:

   Ho: σ² = 4 min. squared
   Ha: σ² ≠ 4 min. squared

   α = 0.05

   Critical values: χ²0.025,24 = 39.36, χ²0.975,24 = 12.40

   Test statistic value:

        χ² = (24)(2.5) / 4 = 15

   Decision on Ho: Since 15 is neither greater than 39.36 nor less than 12.40,
   do not reject Ho.

   Conclusion: There is no sufficient evidence to indicate that the
   community's calls differ in variability from the national value.

   Note: A one-tailed test with Ha: σ² < 4 min. squared can also be used, since
   the sample evidence supports this. This way, only one critical value needs
   to be identified.
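A minimal Python sketch of this chi-square test on a variance (n = 25, s² = 2.5, σo² = 4, α = 0.05), assuming SciPy for the critical values:

     from scipy.stats import chi2

     n, s2, sigma0_sq, alpha = 25, 2.5, 4, 0.05
     df = n - 1

     x2 = (n - 1) * s2 / sigma0_sq          # test statistic: 15
     lower = chi2.ppf(alpha / 2, df)        # 12.40 (the chapter's chi2 with subscript 0.975,24)
     upper = chi2.ppf(1 - alpha / 2, df)    # 39.36 (the chapter's chi2 with subscript 0.025,24)

     print(x2, lower, upper)
     # Do not reject Ho since lower < x2 < upper.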

A.2 Test of equality of two variances

1. Ho: σ1² = σ2²

   a. Ha: σ1² > σ2²
   b. Ha: σ1² < σ2²
   c. Ha: σ1² ≠ σ2²

2. α = 0.01, 0.05, etc.

3. Test statistic: F

   Assume two independent random samples from normal populations
   having variances σ1² and σ2², respectively. Then

        f = s1² / s2²,  under Ho, with v1 = n1 - 1, v2 = n2 - 1

4. Critical region and decision on Ho:

   a. Reject Ho if f ≥ fα(v1, v2)

   b. Reject Ho if f ≤ f1-α(v1, v2)

   c. Reject Ho if f ≥ fα/2(v1, v2) or f ≤ f1-α/2(v1, v2)

5. Conclusion

Example:

   Two sections of a statistics course took the same long exam. A sample
was randomly drawn from each section, showing the following scores:

   A: 58, 65, 68, 80, 76, 82, 90, 88, 93, 88
   B: 63, 78, 80, 91, 85, 60

Assuming that the scores are normally distributed, is there reason to believe that
the population variances are equal? Use α = 0.05.

Solution:

   Ho: σA² = σB²
   Ha: σA² ≠ σB²

   α = 0.05

   Test statistic value under Ho:

        f = sA² / sB² = 139.5111 / 150.1667 = 0.93

   Critical values: f0.025(9, 5) = 6.68 and f0.975(9, 5) = 1/f0.025(5, 9) = 1/4.48 = 0.22

   Decision on Ho: Since 0.93 is in the acceptance region, do not reject Ho. There is no
   sufficient evidence to indicate that the variances are not equal.

Note: For practical purposes, it is best to use the one-tailed test biased toward f > 1,
   that is, with the numerator larger than the denominator. This way, the critical value is
   easily found. This simply means computing the f ratio such that the larger
   sample variance is in the numerator.
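The F ratio and critical values above can be reproduced with the following sketch (SciPy assumed; the sample variances are those already computed in the solution):

     from scipy.stats import f

     s2_A, n_A = 139.5111, 10
     s2_B, n_B = 150.1667, 6
     alpha = 0.05

     f_stat = s2_A / s2_B                         # 0.93
     lo = f.ppf(alpha / 2, n_A - 1, n_B - 1)      # about 0.22 (the chapter's f0.975(9, 5))
     hi = f.ppf(1 - alpha / 2, n_A - 1, n_B - 1)  # about 6.68 (the chapter's f0.025(9, 5))

     print(f_stat, lo, hi)
     # Do not reject Ho since lo < f_stat < hi.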
6.8 Tests of Hypotheses on the Equality of Means

A.1 Test of equality of means when σ1² and σ2² are known; independent samples

1. Ho: μ1 = μ2  or  Ho: μ1 - μ2 = 0  or  Ho: μ1 - μ2 = do

   a. Ha: μ1 > μ2  or  Ha: μ1 - μ2 > 0  or  Ha: μ1 - μ2 > do
   b. Ha: μ1 < μ2  or  Ha: μ1 - μ2 < 0  or  Ha: μ1 - μ2 < do
   c. Ha: μ1 ≠ μ2  or  Ha: μ1 - μ2 ≠ 0  or  Ha: μ1 - μ2 ≠ do

2. α = 0.05, α = 0.01, etc.; P(Reject Ho|Ho is true) ≤ α

3. Test Statistic: Z

   If 2 independent random samples of sizes n1 and n2 are drawn from
   large or infinite populations, discrete or continuous, with means μ1 and μ2
   and variances σ1² and σ2², then X̄1 - X̄2 is approximately normal with
   mean μ1 - μ2 and variance σ1²/n1 + σ2²/n2, and

        z = [(x̄1 - x̄2) - (μ1 - μ2)] / √(σ1²/n1 + σ2²/n2)

   under Ho is a value of the standard normal Z.

4. Critical Region and Decision on Ho (as in A.1 of Section 6.6)

5. Conclusion (as in A.1 of Section 6.6)

Example:

   The television picture tubes of manufacturer A have a standard deviation of 0.9 year,
while those of manufacturer B have a standard deviation of 0.8 year. A random sample of 36
tubes from manufacturer A yielded a mean lifetime of 6.5 years, while a random
sample of 49 tubes from manufacturer B yielded a mean lifetime of 5.0 years. Is
there reason to believe that manufacturer A's picture tubes have a mean lifetime
greater than those of manufacturer B? Test at α = 0.05.

Solution: The two samples can be taken to be independent and coming from
large populations, so X̄A - X̄B is approximately Normal with mean μA - μB and
variance σA²/nA + σB²/nB.

1. Ho: μA - μB = 0
   Ha: μA > μB  or  μA - μB > 0

2. P(Reject Ho|Ho is true) ≤ 0.05

3. Test statistic: Z

        z = [(x̄A - x̄B) - (μA - μB)] / √(σA²/nA + σB²/nB)
          = [(6.5 - 5.0) - 0] / √(0.81/36 + 0.64/49)
          = 1.5 / 0.1885769
          ≈ 7.95

4. Reject Ho if z ≥ z0.05 = 1.645. Since 7.95 > 1.645, reject Ho at α = 0.05.

   Note: P(Z > 7.95) < 0.05

5. Conclusion: There is reason to believe that manufacturer A's picture tubes have a
   mean lifetime greater than those of manufacturer B at the 5% level of
   significance.
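A sketch of the two-sample Z computation for the picture-tube example (assuming SciPy):

     from math import sqrt
     from scipy.stats import norm

     xbar_A, sigma_A, n_A = 6.5, 0.9, 36
     xbar_B, sigma_B, n_B = 5.0, 0.8, 49
     alpha = 0.05

     se = sqrt(sigma_A**2 / n_A + sigma_B**2 / n_B)   # about 0.1886
     z = (xbar_A - xbar_B) / se                       # about 7.95
     z_crit = norm.ppf(1 - alpha)                     # 1.645

     print(z, z_crit, norm.sf(z))
     # Reject Ho since z > z_crit; the p-value is far below 0.05.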
A.2 Test of equality of means when σ1² and σ2² are unknown and σ1² ≠ σ2², based on
a test of the equality of variances; independent samples. (We estimate σ1² and σ2² by s1²
and s2², respectively.)

A.2.1 If n1 ≥ 30 and n2 ≥ 30, the test statistic is Z.

        z = [(x̄1 - x̄2) - (μ1 - μ2)] / √(s1²/n1 + s2²/n2),

   with the critical region as in A.1 of Section 6.6.

A.2.2 If n1 < 30 and n2 < 30, the test statistic is T. Assume X1 and X2 are both
Normal.

        t = [(x̄1 - x̄2) - (μ1 - μ2)] / √(s1²/n1 + s2²/n2),  with v degrees of freedom, where

        v = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 - 1) + (s2²/n2)²/(n2 - 1) ]

   The critical region is as in A.2 of Section 6.6.
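The approximate degrees of freedom v in A.2.2 is the only awkward piece of arithmetic, so a small sketch is given below. The summary numbers used in the call are hypothetical and only illustrate the formula.

     def welch_df(s1_sq, n1, s2_sq, n2):
         """Approximate d.f. for the unequal-variance (A.2.2) t test."""
         a = s1_sq / n1
         b = s2_sq / n2
         return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

     # Hypothetical summary statistics, not taken from this chapter's examples:
     print(welch_df(s1_sq=25.0, n1=12, s2_sq=64.0, n2=15))   # about 23.8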

A.3 Test of equality of means when σ1² and σ2² are unknown and σ1² = σ2², based
on a test of the equality of variances. (We estimate σ1² by s1² and σ2² by s2².)

A.3.1 If n1 ≥ 30 and n2 ≥ 30, the test statistic is Z.

        z = [(x̄1 - x̄2) - (μ1 - μ2)] / [ sp √(1/n1 + 1/n2) ]

        sp² = [ (n1 - 1)s1² + (n2 - 1)s2² ] / (n1 + n2 - 2),  the pooled variance

   This comes from

        σ_(X̄1-X̄2) = √(σ1²/n1 + σ2²/n2) = σ √(1/n1 + 1/n2),  when σ1² = σ2² = σ²,

   so that, estimating σ by sp,

        σ̂_(X̄1-X̄2) = sp √(1/n1 + 1/n2).

   The critical region is as in A.1 of Section 6.6.

A.3.2 If n1 < 30 and n2 < 30, the test statistic is T. Assume X1 and X2 are both
Normal.

        t = [(x̄1 - x̄2) - (μ1 - μ2)] / [ sp √(1/n1 + 1/n2) ],  with v = n1 + n2 - 2

   The critical region is as in A.2 of Section 6.6.

Examples:

1. On an elementary school examination in spelling, the mean grade of 32
   randomly selected boys was 72 with a standard deviation of 8, while the
   mean grade of 36 randomly selected girls was 75 with a standard deviation
   of 6. Test the hypothesis that the girls are better in spelling than the
   boys. Use α = 0.05.

   Solution:

   Assumption: The 2 sets of observations are independent random samples.
   Unequal population variances: σB² ≠ σG² (based on a test)

   Ho: μB = μG
   Ha: μG > μB

   Test statistic: Z (n1 and n2 are large)

        z = [(x̄G - x̄B) - 0] / √(sG²/nG + sB²/nB)
          = (75 - 72) / √(36/36 + 64/32)
          = 3 / 1.7321
          = 1.73

   Critical value: z0.05 = 1.645

   Decision: Since z > z0.05, reject Ho at α = 0.05.

   Conclusion: The girls are better in spelling than the boys at the 5% level of
   significance.

2. A researcher wants to study the effect of type of binder on the burning
   duration of coconut husk charcoal briquettes. Using the same amount of
   binder, he tested the burning duration of coconut husk charcoal briquettes
   with cassava starch and cassava powder as binder. He obtained the
   following results, with 8 briquettes observed per type of binder. Is there a
   difference in burning duration at α = 0.01?

   Type of binder    Burning duration of briquette (min)      Mean      s
   Cassava starch    161 166 170 165 172 168 157 164          165.375   4.8385
   Cassava powder    152 147 157 146 155 157 148 155          152.125   4.5493

   Solution:

   Assumptions:
   1. independent random samples
   2. sampled populations are normal (since sample sizes are small)

   Equal population variances (based on a test)

   Ho: μs = μp
   Ha: μs ≠ μp

   Test statistic: T (n1 and n2 are small)

        t = [(x̄s - x̄p) - 0] / [ sp √(1/n1 + 1/n2) ]
          = (165.375 - 152.125) / [ 4.6961229 √(1/8 + 1/8) ]
          = 5.643

        sp² = [ (n1 - 1)s1² + (n2 - 1)s2² ] / (n1 + n2 - 2)
            = [ (7)(23.410714) + (7)(20.696428) ] / (8 + 8 - 2)
            = 22.053571

   Critical value: t0.005,14 = 2.977

   Decision: Since t > t0.005,14, reject Ho at α = 0.01.

   Conclusion: There is a difference in burning duration at the 1% level of
   significance. The burning duration of briquettes using cassava starch
   as binder appears to be longer.

   Note: A one-tailed test can be used to confirm this direction.
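Since the raw briquette data are listed above, the pooled two-sample t test can be cross-checked with SciPy's ttest_ind (equal_var=True gives the pooled-variance version; note that the p-value it reports is two-sided):

     from scipy.stats import ttest_ind

     starch = [161, 166, 170, 165, 172, 168, 157, 164]
     powder = [152, 147, 157, 146, 155, 157, 148, 155]

     result = ttest_ind(starch, powder, equal_var=True)   # pooled-variance t test
     print(result.statistic, result.pvalue)               # t about 5.64, p well below 0.01
     # Reject Ho at alpha = 0.01 since |t| exceeds t_{0.005,14} = 2.977.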

A.4 Test of equality of means when observations are paired

Ho: μ1 = μ2 or μD = 0

   a. Ha: μ1 > μ2 or μD > 0
   b. Ha: μ1 < μ2 or μD < 0
   c. Ha: μ1 ≠ μ2 or μD ≠ 0

Test statistic: T (Assumption: X1 and X2 are both Normal)

        t = (d̄ - μD) / (sd/√n),  with v = n - 1, for n pairs of observations

        sd² = Σ(di - d̄)² / (n - 1) = [ nΣdi² - (Σdi)² ] / [ n(n - 1) ],  where di = x1i - x2i

        d̄ = Σdi / n = Σ(x1i - x2i) / n = x̄1 - x̄2

   (t ≈ z if n is large.)

Critical region: as in A.2.2/A.1 of Section 6.6

Example:

   To determine whether membership in a fraternity is beneficial or
detrimental to one's grades, a pair of twins--one a fraternity (F) member, the other
a non-fraternity (NF) member--taking the same degree program were observed
over a period of 5 years. Their grade point averages were recorded as follows:

   Year     1     2     3     4     5     Mean
   F        2.0   2.0   2.3   2.1   2.4   2.16
   NF       2.2   1.9   2.5   2.3   2.4   2.26

Assuming that the populations are normal, test at the 0.025 level of significance
whether membership in a fraternity is detrimental to one's grades.

Solution: Let μ1 and μ2 be the mean grades of fraternity and non-fraternity
students, respectively.

   Ho: μ1 = μ2 or μD = 0
   Ha: μ1 < μ2 or μD < 0

   α = 0.025

   Critical region: t < -t0.025,4 = -2.776

   Test statistic: T

        t = (d̄ - μD) / (sd/√n) = (-0.1 - 0) / (0.1414214/√5) = -1.581

        d̄ = x̄F - x̄NF = -0.1

        sd² = [ (5)(0.13) - (-0.5)² ] / [ (5)(4) ] = 0.02,  sd = 0.1414214

   Decision: Since |t| is not greater than t0.025,4 = 2.776, do not reject Ho.

   Conclusion: There is no sufficient evidence to indicate that membership in
   fraternities is detrimental to one's grades.
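The twin data can likewise be checked with SciPy's paired t test, ttest_rel; it reports a two-sided p-value, whereas the example above uses a one-sided test at α = 0.025.

     from scipy.stats import ttest_rel

     frat     = [2.0, 2.0, 2.3, 2.1, 2.4]
     non_frat = [2.2, 1.9, 2.5, 2.3, 2.4]

     result = ttest_rel(frat, non_frat)
     print(result.statistic, result.pvalue)   # t about -1.58, two-sided p about 0.19
     # Do not reject Ho since t is not below -t_{0.025,4} = -2.776.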

6.9 Confidence Intervals on μ and σ²

Def'n. A confidence interval is an interval estimate for a parameter (say μ or σ²)
in the form (a, b), where a and b are called the confidence limits and are
computed based on sample values and the desired confidence level (say, 95% or
99%).

   The interval computed from a particular sample is called a 100(1 - α)%
confidence interval. The fraction 1 - α is called the confidence coefficient and α
is called the level of significance.

A.1 Confidence interval for μ (σ is known)

100(1 - α)% C.I. on μ, σ known (Assumption: X̄ is approximately normally
distributed):

        μ ∈ ( x̄ - zα/2 · σ/√n ,  x̄ + zα/2 · σ/√n )

Proof:

        P(-zα/2 < Z < zα/2) = 1 - α   (acceptance region)

        P( -zα/2 < (X̄ - μ)/(σ/√n) < zα/2 ) = 1 - α

        -zα/2 · σ/√n < X̄ - μ < zα/2 · σ/√n

        X̄ - zα/2 · σ/√n < μ < X̄ + zα/2 · σ/√n

        P( X̄ - zα/2 · σ/√n < μ < X̄ + zα/2 · σ/√n ) = 1 - α

Interpreting a 95% C.I. on μ:

   In repeated sampling, if a 95% C.I. were constructed from every sample
mean, 5% of these intervals would not contain μ, while 95% would cover μ.

   We do not know whether a particular constructed interval contains μ, but we know
that 95% of the constructed C.I.'s will contain μ. Hence, we are 95% confident
that μ lies in this interval.

A.2 Confidence interval on μ (σ is unknown)

100(1 - α)% C.I. on μ (assuming that X is normally distributed):

        μ ∈ ( x̄ - tα/2,v · s/√n ,  x̄ + tα/2,v · s/√n ),  v = n - 1

Proof:

        P( -tα/2,n-1 < T < tα/2,n-1 ) = 1 - α,  where T = (X̄ - μ)/(S/√n)

        P( X̄ - tα/2,v · S/√n < μ < X̄ + tα/2,v · S/√n ) = 1 - α

Note: If n is large, X̄ is approximately normal, so the interval for μ is as in A.1,
regardless of the distribution of X.

Using a confidence interval in hypothesis testing:

   Construct an interval for the parameter at level α. If the interval contains the
parameter value under Ho, do not reject Ho. If it does not, reject Ho at level α.

Examples:

1. Example of a test on μ, σ² known (Ho: μ = 75):

   99% C.I. on μ:                      95% C.I. on μ:

        x̄ ± z0.005 · σ/√n                   x̄ ± z0.025 · σ/√n
        86 ± (2.575)(15/5)                  86 ± (1.96)(15/5)
        86 ± 7.725                          86 ± 5.88
        μ ∈ (78.275, 93.725)                μ ∈ (80.12, 91.88)

   Since 75 lies in neither interval, reject Ho at α = 0.01 and at α = 0.05.

2. Example of a test on μ, σ² unknown (Ho: μ = 50):

   99% C.I. on μ:

        47 ± t0.005,15 · s/√n
        47 ± (2.947)(15/4)
        47 ± 11.05125
        μ ∈ (35.95, 58.05)

   Since 50 lies in this interval, do not reject Ho.

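For readers who wish to verify these limits numerically, here is a minimal Python sketch (SciPy is assumed to be available; it is not part of this chapter):

     from math import sqrt
     from scipy.stats import norm, t

     # 99% z-interval for Example 1: x_bar = 86, sigma = 15, n = 25
     half = norm.ppf(1 - 0.01 / 2) * 15 / sqrt(25)
     print(86 - half, 86 + half)          # about (78.3, 93.7)

     # 99% t-interval for Example 2: x_bar = 47, s = 15, n = 16, v = 15
     half = t.ppf(1 - 0.01 / 2, 15) * 15 / sqrt(16)
     print(47 - half, 47 + half)          # about (35.95, 58.05)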
B. Confidence Interval on σ²

100(1 - α)% C.I. on σ², for X ~ N(μ, σ²):

        σ² ∈ ( (n - 1)S² / χ²α/2,v ,  (n - 1)S² / χ²1-α/2,v ),  v = n - 1

Proof:

        P( χ²1-α/2,v < χ² < χ²α/2,v ) = 1 - α,  where χ² = (n - 1)S² / σ²

        P( (n - 1)S² / χ²α/2,v < σ² < (n - 1)S² / χ²1-α/2,v ) = 1 - α

Example: Test on σ² (Ho: σ² = 4 minutes squared)

   95% C.I. on σ²:

        lower limit = (n - 1)s² / χ²α/2,v = (24)(2.5) / 39.36 = 1.52,  with χ²α/2,v = χ²0.025,24
        upper limit = (n - 1)s² / χ²1-α/2,v = (24)(2.5) / 12.40 = 4.84,  with χ²1-α/2,v = χ²0.975,24

        σ² ∈ (1.52, 4.84)

   Since 4 lies in this interval, do not reject Ho.

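A similar sketch for the variance interval just computed (n = 25, s² = 2.5, 95% confidence), again assuming SciPy:

     from scipy.stats import chi2

     n, s2, alpha = 25, 2.5, 0.05
     df = n - 1

     lower = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, df)   # 60 / 39.36 = 1.52
     upper = (n - 1) * s2 / chi2.ppf(alpha / 2, df)       # 60 / 12.40 = 4.84
     print(lower, upper)
     # The interval (1.52, 4.84) contains 4, so Ho: sigma^2 = 4 is not rejected.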
6.10 Other χ² Tests

   In addition to the χ² test for the population variance in Section 6.7, there are
two other uses of the χ² test included in this chapter.

A. χ² Test for Independence (χ² test for k independent samples)

Use: To determine the significance of the differences among k independent
groups.

Data: an r x k contingency table with rk classes
   r levels of one factor (first classification criterion)
   k levels of the other factor (second classification criterion)

   Example: An agricultural economist studying factors affecting the
   adoption of high yielding varieties (HYV) of rice wishes to
   know if adoption is affected by the tenure status of farmers.

   Tenure status: owner operator, share-rent farmer, fixed-rent farmer
   Adoption status: adoptor, nonadoptor

   This gives a 3 x 2 contingency table (r = 3, k = 2; 6 classes) and
   2 independent groups: adoptors and nonadoptors (two independent
   random samples from two populations).

Ho: The k independent samples do not differ among themselves with respect
    to the other factor; or, the two factors are independent.

Ha: The two factors are related.

Example:

   Ho: The two groups of farmers do not differ among themselves in terms of
       tenure status; or, a farmer's adoption of HYV's is independent of his tenure status.

   Ha: A farmer's adoption of HYV's depends on his tenure status.

α = 0.05, 0.01, etc.

Test statistic: χ²

        χ² = Σᵢ Σⱼ (oij - eij)² / eij,  summed over i = 1, ..., r and j = 1, ..., k,
        with df = (r - 1)(k - 1)

where:

   oij = observed frequency in the (i, j)th cell
   eij = expected frequency in the (i, j)th cell under Ho (i.e., if the two
         factors are independent)

        eij = (ri)(kj) / n

   ri = ith row total
   kj = jth column total
   n = total number of observations

Derivation:

   eij = P(an element belongs to the ith level of the first factor and the jth level of
         the second) x n
       = (kj/n)(ri/n) x n

Note:

1. For a 2 x 2 table with n between 20 and 40, the test may be used if all eij ≥ 5.
2. If n < 20, the test is not applicable.
3. When df > 1 (either k or r is > 2), fewer than 20% of the cells should have
   eij < 5, and no cell should have an expected frequency of < 1.
4. If these requirements are not met, one may combine adjacent
   categories to increase the eij's to satisfy the requirements.
   Otherwise, the results of the test are meaningless.
5. In a 2 x 2 table, when df = 1, apply Yates's correction for continuity
   when 5 < eij < 10:

        χ²(corrected) = Σᵢ Σⱼ ( |oij - eij| - 0.5 )² / eij

Justification: The continuous χ² distribution approximates the discrete sampling
   distribution of the statistic very well provided that df > 1. A correction is then
   applied when df = 1.

Critical region and decision on Ho:

   Reject Ho if χ² ≥ χ²α,df,  df = (r - 1)(k - 1)

   Note: P(χ² ≥ χ²α,v) = α under Ho.

Example:

                              No. of Farmers
   Tenure Status          Adoptor    Nonadoptor    Total
   Owner operator           102          26         128
   Share-rent farmer         42          10          52
   Fixed-rent farmer          4           3           7
   Total                    148          39         187

   Ho: The two groups of farmers do not differ among themselves in terms of
       tenure status; adoption and tenure status are independent. (The ratio of
       adoptors to nonadoptors remains the same for all three tenure statuses.)

   Ha: The two groups of farmers differ among themselves in terms of tenure
       status; adoption and tenure status are related. (The ratio of adoptors to
       nonadoptors does not remain the same for the three tenure statuses.)

   α = 0.05

   Test statistic: χ²

        e11 = r1·k1/n = (128)(148)/187 = 101.3
        e12 = r1·k2/n = (128)(39)/187 = 26.7
        e21 = r2·k1/n = (52)(148)/187 = 41.2
        e22 = r2·k2/n = (52)(39)/187 = 10.8
        e31 = r3·k1/n = (7)(148)/187 = 5.5
        e32 = r3·k2/n = (7)(39)/187 = 1.5

   Note: The test requires that fewer than 20% of the 6 cells have an eij < 5.
         20% of 6 = 1.2, so fewer than 1 cell, or no cell, should have an eij < 5.

   Since e32 < 5, there is a need to merge adjacent categories. Merging the
   share-rent and fixed-rent categories reduces the table to a 2 x 2.

   Tenure Status        Adoptor    Nonadoptor    Total
   Owner operator         102          26         128
   "Renter"                46          13          59
   Total                  148          39         187

        e11 = 101.3,  e12 = 26.7
        e'21 = r2'·k1/n = (59)(148)/187 = 46.7
        e'22 = r2'·k2/n = (59)(39)/187 = 12.3

        χ² = Σᵢ Σⱼ (oij - eij)² / eij
           = (102 - 101.3)²/101.3 + (26 - 26.7)²/26.7 + (46 - 46.7)²/46.7 + (13 - 12.3)²/12.3
           = 0.004837 + 0.018352 + 0.0104925 + 0.0398374
           = 0.074

   Critical region and decision on Ho:

   Since χ² = 0.074 < χ²0.05,1 = 3.84, do not reject Ho.

   Conclusion: There is no sufficient evidence to indicate that the two factors are
   related.
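The merged 2 x 2 table can be checked with SciPy's chi2_contingency; passing correction=False reproduces the uncorrected statistic computed above (by default the function applies Yates's correction to 2 x 2 tables).

     from scipy.stats import chi2_contingency

     observed = [[102, 26],    # owner operator: adoptor, nonadoptor
                 [ 46, 13]]    # "renter":       adoptor, nonadoptor

     stat, p_value, df, expected = chi2_contingency(observed, correction=False)
     print(stat, p_value, df)   # statistic about 0.074, df = 1, p-value about 0.79
     print(expected)            # expected counts: about 101.3, 26.7, 46.7, 12.3
     # Do not reject Ho since the statistic is below chi2_{0.05,1} = 3.84.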

B. χ² Goodness-of-fit Test

Use: To determine if a population has a specified theoretical distribution.

1. Ho: The distribution provides a good fit.
   Ha: The distribution does not provide a good fit.

2. α = 0.05, 0.01, etc.

3. Test statistic: χ²

        χ² = Σᵢ (oi - ei)² / ei,  summed over i = 1, ..., k

   where:
      oi = observed frequency in the ith cell
      ei = expected frequency in the ith cell
      k = number of cells

   Note: If the oi's are close to the ei's, the χ² value will be small.
   This implies a good fit. Hence, we do not reject Ho.

   We proceed with the test only if the ei's are each at least equal to 5.
   If not, one needs to combine adjacent cells to increase the ei's.

4. Critical region and decision on Ho:

   Reject Ho if χ² ≥ χ²α,v.

   The number of degrees of freedom (df or v) depends on two factors:

   1. The number of cells in the experiment.
   2. The number of quantities obtained from the observed data that are
      necessary in the calculation of the expected frequencies.

   Thus: The number of df is equal to the number of cells minus the number
   of quantities obtained from the observed data which are used in
   the calculation of the expected frequencies.

5. Conclusion

Example 1: A die is tossed 120 times.

1. Ho: The distribution of outcomes is uniform. (The die is balanced.)
   Ha: The distribution of outcomes is not uniform. (The die is not balanced.)

2. α = 0.05

3. Test statistic: χ²

   Frequency                    Face
                1     2     3     4     5     6     Total
   Observed    20    22    17    18    19    24      120
   Expected    20    20    20    20    20    20

   Note: ei = 120/6 = 20 under Ho.

        χ² = (20 - 20)²/20 + ... + (24 - 20)²/20 = 1.7

4. Critical region and decision on Ho:

   Now, df = 6 cells - 1 = 5; the subtrahend is 1 because only one quantity, the
   total frequency (120), was used to determine the expected frequencies. Thus, the
   critical value is χ²0.05,5 = 11.070. Since 1.7 < 11.070, we do not reject Ho.

5. Conclusion: There is no sufficient evidence to indicate that the die
   is not balanced. (If 120 trials is already adequate for the researcher,
   he can conclude that the die is balanced.)
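For the die example, SciPy's chisquare function defaults to equal expected frequencies, which is exactly the uniform hypothesis, so it can serve as a quick check:

     from scipy.stats import chisquare

     observed = [20, 22, 17, 18, 19, 24]       # 120 tosses
     result = chisquare(observed)              # expected = 20 per face by default
     print(result.statistic, result.pvalue)    # 1.7 and a p-value near 0.89
     # Do not reject Ho since 1.7 < chi2_{0.05,5} = 11.070.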

Example 2:

1. Ho: The frequency distribution of battery lives may be approximated
   by the normal distribution.

   Ha: The frequency distribution of battery lives cannot be approximated
   by the normal distribution.

2. α = 0.05

3. Test statistic: χ²

        χ² = Σᵢ (oi - ei)² / ei

   where:
      ei = P(z1 < Z < z2) x n,
      z1 = z value of the lower class boundary, and
      z2 = z value of the upper class boundary.

   That is, ei is obtained from a normal curve having the same mean and
   standard deviation as the sample (x̄ = 3.4125, s = 0.6818 for this data
   set).

Class Boundaries oi ei
1.45 – 1.95 2 0.56
1.95 – 2.45 1 2.532
2.45 – 2.95 4 6.76
2.95 – 3.45 15 10.944
3.45 – 3.95 10 10.532
3.95 – 4.45 5 6.02
4.45 – 4.95 3 2.09
Total 40 39.438

   Sample computations for e1:

        z1 = z value of the lower boundary of the first class = (1.45 - 3.4125) / 0.6818 = -2.878
        z2 = z value of the upper boundary of the first class = (1.95 - 3.4125) / 0.6818 = -2.145

        P(an observation lies in the first class) = P(-2.878 < Z < -2.145)
                                                  = P(2.145 < Z < 2.878)
                                                  = P(Z < 2.878) - P(Z < 2.145)
                                                  = 0.9980 - 0.9840
                                                  = 0.014

        Thus, e1 = (0.014)(40) = 0.56.

   Combining adjacent classes (the first three classes and the last two classes,
   respectively) to satisfy the requirement that all ei ≥ 5:

   Class Boundaries     oi'      ei'
   1.45 – 2.95            7     9.852
   2.95 – 3.45           15    10.944
   3.45 – 3.95           10    10.532
   3.95 – 4.95            8     8.11

   Hence,

        χ² = (7 - 9.852)²/9.852 + (15 - 10.944)²/10.944 + (10 - 10.532)²/10.532 + (8 - 8.11)²/8.11
           = 0.8256 + 1.5032 + 0.0269 + 0.0015
           = 2.357

4. Critical value and decision on Ho:

   Now, df = 4 cells - 3 = 1; the subtrahend is 3 because three quantities (x̄, s,
   and n) were used in determining the expected frequencies. Thus, the critical
   value is χ²0.05,1 = 3.841.

   Since 2.357 < 3.841, we do not reject Ho.

5. Conclusion: There is no sufficient evidence to conclude the alternative
   hypothesis. The researcher can conclude that the normal
   distribution provides a good fit for the distribution of battery
   lives if he finds the sample size adequate for this purpose.
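The battery-life computation can be redone by hand with NumPy/SciPy, as sketched below; the class boundaries, x̄ = 3.4125, and s = 0.6818 are taken from the example, and the expected counts are recomputed directly from the normal curve (so they differ slightly from the rounded table values).

     import numpy as np
     from scipy.stats import norm, chi2

     mean, sd, n = 3.4125, 0.6818, 40
     # Boundaries and observed counts after merging classes as in the example
     bounds   = np.array([1.45, 2.95, 3.45, 3.95, 4.95])
     observed = np.array([7, 15, 10, 8])

     probs = np.diff(norm.cdf(bounds, loc=mean, scale=sd))   # class probabilities
     expected = n * probs                                    # about 9.9, 10.9, 10.5, 8.1

     stat = np.sum((observed - expected) ** 2 / expected)    # about 2.4 (2.357 above)
     crit = chi2.ppf(0.95, df=len(observed) - 3)             # df = 4 - 3 = 1 -> 3.841
     print(stat, crit)
     # Do not reject Ho since the statistic is below the critical value.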
