Hypothesis Testing
Example. The manager of a large business wishes to check the accuracy of the business records against the physical inventories. From past experience, the true proportion of inaccurate records is estimated at 35%. Determine the sample size for the sample survey so that the sampling error in estimating the proportion of inaccurate records is not more than 5% above or below the true proportion, almost surely.
Solution. In the usual notations, we are given:
P = population proportion of inaccurate records = 0.35,   E = 5% = 0.05.
Since the estimate is required to hold almost surely, the confidence coefficient is taken as 1 - α = 0.9973. From standard normal tables, we know that P(|Z| ≤ 3) = 0.9973, i.e., Z lies between the limits ±3 almost surely. Hence zα/2 = 3. Substituting in the sample-size formula for a proportion, we get:
n = (zα/2)² P(1 - P)/E² = (3)² × 0.35 × (1 - 0.35)/(0.05)² = 2.0475/0.0025 = 819.
Hence the required sample size is n = 819.
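As a quick arithmetic check (not part of the original text), the same calculation can be reproduced in a few lines of Python; exact fractions are used only to avoid floating-point rounding noise, and the data are those of the example.

    from fractions import Fraction

    # Data of the example, kept as exact fractions
    P = Fraction(35, 100)   # estimated population proportion of inaccurate records
    E = Fraction(5, 100)    # permissible sampling error (5%)
    z = 3                   # 'almost sure' limits: P(|Z| <= 3) = 0.9973

    n = z**2 * P * (1 - P) / E**2
    print(n)                # 819, agreeing with the value obtained above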
16-6. TESTING OF HYPOTHESIS
Inductive inference is based on deciding about the characteristics of the population on the basis of a random sample drawn from it. A statistical hypothesis is some assumption or statement, which may or may not be true, about a population, or equivalently about the probability distribution characterising the given population, which we want to test on the basis of the evidence from a random sample.
If the hypothesis completely specifies the population, then it is known as a simple hypothesis; otherwise it is known as a composite hypothesis.
In sampling from a normal population N(μ, σ²), the hypothesis
H : μ = μ₀ and σ² = σ₀²
is a simple hypothesis, since it completely specifies the distribution. On the other hand, each of the following hypotheses is a composite hypothesis:
(i) μ = μ₀ (nothing being stated about σ²),
(ii) σ² = σ₀² (nothing being stated about μ).
The test of a statistical hypothesis is a two-action decision problem after observing a random sample from the given population, the two actions being the acceptance or the rejection of the hypothesis under consideration. The decision about the truth or falsity of a statistical hypothesis is thus based on the information contained in the sample. The sample may be consistent or inconsistent with the hypothesis and accordingly, the hypothesis may be accepted or rejected. It should be clearly borne in mind that the acceptance of a statistical hypothesis is due to insufficient evidence provided by the sample to reject it, and does not necessarily imply that it is true.
The difference between a sample statistic and the corresponding population parameter, or between two independent sample statistics, is said to be not significant if it can be attributed to the fluctuations of sampling; otherwise, it is said to be significant. The test is based on an appropriate sampling distribution (Binomial, Poisson, t, F, χ², etc.); for large samples, the standardised variate corresponding to the statistic t, viz.,
Z = [t - E(t)] / S.E.(t),   ... (16-58)
is known as the test statistic. For any statistic t, the values of E(t) and S.E.(t) are invariably in terms of the population parameters. For example, for the statistic t = x̄ (sample mean), E(x̄) = μ and S.E.(x̄) = σ/√n; and for the statistic t = p (sample proportion), E(p) = P and S.E.(p) = √(PQ/n). Hence, if the problem is not of the type described in (1) and (2) above, sometimes we set up the null hypothesis in such a way that it completely specifies the parameters E(t) and S.E.(t).
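As an illustration of (16-58) for a sample proportion, a minimal Python sketch is given below. It is not from the text; the figures (56 defectives in a sample of 400, with hypothesised proportion P = 0.10) are purely hypothetical.

    import math

    n, x, P0 = 400, 56, 0.10                 # hypothetical sample and hypothesised proportion

    p = x / n                                # sample proportion, the statistic t
    se = math.sqrt(P0 * (1 - P0) / n)        # S.E.(p) = sqrt(PQ/n) under the hypothesis
    Z = (p - P0) / se                        # Z = [t - E(t)] / S.E.(t)   ... (16-58)
    print(round(Z, 3))                       # 2.667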
Suppose, for example, a manufacturer claims that the average life of the batteries he produces is at least 48 months. If there are complaints from the consumers, then the manufacturer's claim is tested. The statistical hypothesis that is to be tested is called the null hypothesis, and is denoted by H₀. Thus, to prove or disprove the manufacturer's claim, we will test the statistical hypothesis that μ ≥ 48 months, i.e., the null hypothesis is H₀ : μ ≥ 48 months. (Note that this contains the equality sign also.)
The alternative hypothesis (when H₀ is not true) is H₁ : μ < 48 months.
2. In testing if the die (dice) is (are) unbiased, we set up the null hypothesis that the die (dice) is (are) unbiased, otherwise we will not be in a position to estimate the population parameters. [See Example 17-1.]
16-6-4. Alternative Hypothesis. Any hypothesis which is complementary to the null hypothesis is called an alternative hypothesis. It is usually denoted by H₁. It is very important to explicitly state the alternative hypothesis in respect of any null hypothesis H₀, because the acceptance or rejection of H₀ is meaningful only if it is being tested against a rival hypothesis. For example, if we want to test the null hypothesis that the population has a specified mean μ₀ (say), i.e.,
H₀ : μ = μ₀,
then the alternative hypothesis could be:
(i) H₁ : μ ≠ μ₀ (i.e., μ > μ₀ or μ < μ₀),
(ii) H₁ : μ > μ₀,
(iii) H₁ : μ < μ₀.
The alternative hypothesis in (i) is known as a two-tailed alternative, while the alternatives in (ii) and (iii) are known as right-tailed and left-tailed alternatives respectively.
In reaching a decision about accepting or rejecting a null hypothesis H₀ on the basis of sample information, an element of risk is always involved. Four situations can arise:
(i) Reject H₀ when H₀ is false,   (ii) Reject H₀ when H₀ is true,
(iii) Accept H₀ when H₀ is true,   (iv) Accept H₀ when H₀ is false.
The decisions (i) and (iii) are correct decisions, while the decisions (ii) and (iv) are wrong decisions. These decisions may be expressed in the following dichotomous table.
                                 Decision from Sample
                           Reject H₀                  Accept H₀
  True     H₀ True         Wrong (Type I Error)       Correct
  State    H₀ False
           (H₁ True)       Correct                    Wrong (Type II Error)
Thus, in testing of hypothesis we are likely to commit two types of errors. The error of rejecting H₀ when H₀ is true is known as a Type I Error, and the error of accepting H₀ when H₀ is false (i.e., H₁ is true) is known as a Type II Error.
Remark. Type I and Type II Errors:
We make a Type I error by rejecting a true null hypothesis.
We make a Type II error by accepting a wrong null hypothesis.
If we write
α = P[Rejecting a good lot],   ... (16-59a)
β = P[Accepting a bad lot],   ... (16-59b)
then α and β are known as the producer's risk and the consumer's risk respectively.
An ideal test procedure would be one which is so planned as to safeguard against both these errors. But practically, in any given problem, it is not possible to minimise both these errors simultaneously. An attempt to decrease α results in an increase in β and vice versa. In practice, in most of the decision-making problems in business and social sciences, it is more risky to accept a wrong hypothesis than to reject a correct one, i.e., the consequences of a Type II error are likely to be more serious than the consequences of a Type I error. Since, for a given sample, both the errors cannot be reduced simultaneously, a compromise is made by minimising the more serious error after fixing up the less serious error. Thus, we fix α, the size of the Type I error, and then try to obtain a criterion which minimises β, the size of the Type II error. We have:
β = P[Type II Error] = P[Accepting H₀ when H₀ is false, i.e., H₁ is true].
Now, P[Accept H₀ when H₀ is wrong] + P[Reject H₀ when H₀ is wrong] = 1
⟹ P[Reject H₀ when H₀ is wrong] = 1 - P[Accept H₀ when H₀ is wrong] = 1 - β.   ... (16-60)
Obviously, when H₀ is wrong, it ought to be rejected; hence minimising β amounts to maximising (1 - β), which is called the power of the test. Thus, the usual practice in testing of hypothesis is to fix α, the size of the Type I error, and then try to obtain a criterion which minimises β, the size of the Type II error, or equivalently, maximises (1 - β), the power of the test.
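The trade-off between α, β and the power (1 - β) can be made concrete with a small numerical sketch for a right-tailed test of a normal mean with known σ. This example is not from the text; the figures (μ₀ = 50, σ = 10, n = 25, α = 0.05, alternative μ₁ = 55) are assumed purely for illustration.

    from scipy.stats import norm

    mu0, sigma, n, alpha = 50, 10, 25, 0.05      # H0: mu = 50, right-tailed test
    mu1 = 55                                     # a specific alternative under H1
    se = sigma / n ** 0.5                        # standard error of the sample mean

    xbar_crit = mu0 + norm.ppf(1 - alpha) * se   # critical value of the sample mean
    beta = norm.cdf((xbar_crit - mu1) / se)      # P[accept H0 | mu = mu1]
    power = 1 - beta                             # P[reject H0 | mu = mu1]
    print(round(xbar_crit, 2), round(beta, 4), round(power, 4))
    # Choosing a smaller alpha pushes the critical value up, which increases beta.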
16-6-6. Level of Significance. The maximum size of the Type I error, which we are prepared to risk, is known as the level of significance. It is usually denoted by α and is given by:
α = P[Rejecting H₀ when H₀ is true].
Commonly used levels of significance in practice are 5% (0.05) and 1% (0.01). If we adopt 5% level of significance, it implies that in 5 samples out of 100, we are likely to reject a correct H₀. In other words, this implies that we are 95% confident that our decision to reject H₀ is correct. The level of significance is always fixed in advance before collecting the sample information.
Remark. When we reject a null hypothesis H₀, we have certain confidence in our decision, which depends on the level of significance employed. Thus, at 'α' level of significance, the degree of confidence in our decision is (1 - α), which is also called the confidence coefficient. However, when we accept a null hypothesis, we don't have any confidence in our decision. The acceptance is merely due to the fact that the sample data does not provide us sufficient evidence against the null hypothesis.
16-6-7. Critical Region. Suppose we take several samples of the same size from a given population and compute some statistic t (say x̄, p, etc.) for each of these samples. Let t₁, t₂, ... be the values of the statistic for these samples. Each of these values may be used to test some null hypothesis H₀. Some values may lead to the rejection of H₀ while others may lead to the acceptance of H₀. These sample statistics (comprising the sample space) may be divided into two mutually disjoint groups, one leading to the rejection of H₀ and the other leading to the acceptance of H₀. The statistics which lead to the rejection of H₀ give us a region called the Critical Region (C) or Rejection Region (R), while those which lead to the acceptance of H₀ give us a region called the Acceptance Region (A). Thus, if the statistic t ∈ C, H₀ is rejected, and if t ∈ A, H₀ is accepted.
The sizes of the Type I and Type II errors in terms of the critical region are defined below:
α = P[Rejecting H₀ when H₀ is true] = P[Rejecting H₀ | H₀] = P[t ∈ C | H₀],
and β = P[Accepting H₀ when H₀ is wrong] = P[Accepting H₀ when H₁ is true] = P[Accepting H₀ | H₁] = P[t ∈ A | H₁],
where C is the critical (rejection) region, A is the acceptance region, and C ∩ A = ∅, C ∪ A = S (the sample space).
Example 16-20. In order to test whether a coin is perfect, it is tossed 5 times. The null hypothesis of perfectness is rejected if and only if more than 4 heads are obtained. Obtain the
(i) critical region, (ii) probability of Type I error, and (iii) probability of Type II error, when the corresponding probability of getting a head is 0.2.
Solution. Let X denote the number of heads obtained in 5 tosses of the coin.
H₀: The coin is perfect, i.e., unbiased, i.e., H₀: p = 1/2.
Under H₀, X ~ B(n = 5, p = 1/2), so that
P(X = x | H₀) = 5Cx p^x q^(5-x) = 5Cx (1/2)^x (1/2)^(5-x) = 5Cx / 32,  x = 0, 1, ..., 5.   ... (*)
(i) Critical Region or Rejection Region. Reject H₀ if more than 4 heads are obtained.
∴ Critical region = {X > 4} = {X = 5}.
(ii) The probability of Type I error (α) is given by:
α = P[Reject H₀ | H₀] = P[X = 5 | H₀] = 5C5 / 32 = 1/32 = 0.03125.   [From (*)]
(iii) The probability of Type II error (β), when the probability of getting a head is p = 0.2, is given by:
β = P[Accept H₀ | p = 0.2] = P[X ≤ 4 | p = 0.2] = 1 - P[X = 5 | p = 0.2] = 1 - (0.2)^5 = 1 - 0.00032 = 0.99968.
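These probabilities can be checked directly from the binomial distribution; the short Python sketch below is an illustrative aside and not part of the original solution.

    from math import comb

    def pmf(x, p, n=5):
        """Binomial probability P(X = x) for n tosses with P(head) = p."""
        return comb(n, x) * p ** x * (1 - p) ** (n - x)

    alpha = pmf(5, 0.5)                          # P[X = 5 | H0: p = 1/2]
    beta = sum(pmf(x, 0.2) for x in range(5))    # P[X <= 4 | p = 0.2]
    print(alpha, round(beta, 5))                 # 0.03125 0.99968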
Example 16-21. In order to test whether a coin is perfect, it is tossed 5 times. The null hypothesis of perfectness of the coin is accepted if and only if at most 3 heads are obtained. Then the power of the test corresponding to the alternative hypothesis that the probability of head is 0.4, is:
(i) 272/3125,   (ii) 2853/3125,   (iii) 56/3125,   (iv) none of these.
Solution. Let X denote the number of heads obtained in 5 tosses of the coin, and let p denote the probability of getting a head in a random toss of the coin.
Null hypothesis, H₀ : p = 1/2.   Alternative hypothesis, H₁ : p = 0.4.   Critical region: X > 3.
The power of the test for testing H₀ against H₁ is given by:
1 - β = P(Reject H₀ when H₁ is true) = P(Reject H₀ | H₁)
= P(X > 3 | p = 0.4) = Σ (x = 4 to 5) 5Cx (0.4)^x (0.6)^(5-x)   [∵ X ~ B(n = 5, p = 0.4) under H₁]
= 5C4 (0.4)^4 (0.6) + (0.4)^5 = 5 × 0.0256 × 0.6 + 0.01024 = 0.0768 + 0.01024 = 0.08704 = 272/3125.
Hence (i) is the correct answer.
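As a quick check (again not part of the original text), the power can be recomputed from the binomial probabilities under the alternative p = 0.4:

    from math import comb

    power = sum(comb(5, x) * 0.4 ** x * 0.6 ** (5 - x) for x in (4, 5))
    print(round(power, 5))        # 0.08704, i.e., 272/3125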
16-6-8. One-Tailed and Two-Tailed Tests. In any test, the critical region is represented by a portion of the area under the probability curve of the sampling distribution of the test statistic.
A test of any statistical hypothesis where the alternative hypothesis is one-tailed (right-tailed or left-tailed) is called a one-tailed test. For example, a test for testing the mean of a population
H₀ : μ = μ₀
against the alternative hypothesis
H₁ : μ > μ₀ (right-tailed)   or   H₁ : μ < μ₀ (left-tailed),
is a single-tailed test. In the right-tailed test (H₁ : μ > μ₀), the critical region lies entirely in the right tail of the sampling distribution of x̄, while for the left-tailed test (H₁ : μ < μ₀), the critical region is entirely in the left tail of the distribution of x̄.
A test of a statistical hypothesis where the alternative hypothesis is two-tailed, such as:
H₀ : μ = μ₀
against the alternative hypothesis
H₁ : μ ≠ μ₀ (μ > μ₀ and μ < μ₀),
is known as a two-tailed test, and in such a case the critical region is given by the portion of the area lying in both the tails of the probability curve of the test statistic.
In a particular problem, whether a one-tailed or a two-tailed test is to be applied depends entirely on the nature of the alternative hypothesis. If the alternative hypothesis is two-tailed, we apply a two-tailed test, and if the alternative hypothesis is one-tailed, we apply a one-tailed test.
For example, suppose there are two population brands of bulbs, one manufactured by the standard process (with mean life μ₁) and the other manufactured by some new technique (with mean life μ₂). If we want to test whether the two brands of bulbs differ significantly, then our null hypothesis is H₀ : μ₁ = μ₂ and the alternative hypothesis is H₁ : μ₁ ≠ μ₂, thus giving us a two-tailed test. However, if we want to test whether the bulbs produced by the new process have a higher average life than those produced by the standard process, we have H₀ : μ₁ = μ₂ and H₁ : μ₁ < μ₂, thus giving us a left-tailed test. Similarly, for testing if the product of the new process is inferior to that of the standard process, we have H₀ : μ₁ = μ₂ and H₁ : μ₁ > μ₂, thus giving us a right-tailed test. Thus, the decision about applying a two-tailed test or a single-tailed (right or left) test will depend on the problem under study.
16-6-9. Critical Values and Significant Values. The value of the test statistic which separates the critical (or rejection) region and the acceptance region is called the critical value or significant value. It depends upon (i) the level of significance used, and (ii) the alternative hypothesis, i.e., whether it is two-tailed or single-tailed.
The critical value zα of the test statistic Z at level of significance α for a single-tailed test is determined by the relations:
P(Z > zα) = α, for a right-tailed test,   and   P(Z < -zα) = α, for a left-tailed test.
[Fig. 16-3(a): Right-Tailed Test (level of significance 'α'); Fig. 16-3(b): Left-Tailed Test (level of significance 'α').]
Let z₁ be the critical value of Z at level of significance α for a two-tailed test, so that
P(|Z| > z₁) = α   ... (*)
⟹ P(Z > z₁) + P(Z < -z₁) = α
⟹ P(Z > z₁) + P(Z > z₁) = α   (by symmetry)
⟹ 2P(Z > z₁) = α
⟹ P(Z > z₁) = α/2 ⟹ z₁ = zα/2.   [From (16-63)]
This implies that the area in each tail of the curve is (α/2).
Hence, the critical values for a two-tailed test at level of significance 'α' are ±zα/2, as shown in Fig. 16-4.
[Fig. 16-4: Two-Tailed Test (level of significance 'α'), showing the lower critical value -zα/2 and the upper critical value +zα/2, with a rejection region of area (α/2) in each tail.]
Thus, the significant or critical value of Z for a single-tailed test (left or right) at level of significance 'α' is the same as the critical value of Z for a two-tailed test at level of significance '2α'.
Remarks 1. The significant values of Z, viz., 1.96 (for α = 0.05) and 2.58 (for α = 0.01) for a two-tailed test, and 1.645 (for α = 0.05) and 2.33 (for α = 0.01) for a single-tailed (right-tailed or left-tailed) test, should be committed to memory. The decision about applying a two-tailed test or a single-tailed test will be determined by the alternative hypothesis H₁.
2. If Z ~ N(0, 1), then from the normal probability Table VI, we get:
P(|Z| ≤ 3) = P(-3 ≤ Z ≤ 3) = 2P(0 ≤ Z ≤ 3)   (by symmetry)
= 2 × 0.49865 = 0.9973.   (From Normal Probability Tables)
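The critical values quoted in Remark 1 can be reproduced from the inverse of the standard normal distribution function; the snippet below is an illustrative aside (not part of the text) using scipy.

    from scipy.stats import norm

    for alpha in (0.05, 0.01):
        z_two = norm.ppf(1 - alpha / 2)   # critical value for a two-tailed test
        z_one = norm.ppf(1 - alpha)       # critical value for a single-tailed test
        print(alpha, round(z_two, 3), round(z_one, 3))
    # 0.05 -> 1.96 and 1.645;  0.01 -> 2.576 and 2.326 (quoted as 2.58 and 2.33 above)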
CRITICAL REGION
The critical (rejection) region at level of significance α is given by:
(i) Z > zα, for a right-tailed test,   ... (16-66)
(ii) Z < -zα, for a left-tailed test.   ... (16-66a)
Hence, for a single-tailed (right-tailed or left-tailed) test, the critical (rejection) region at level of significance α is |Z| > zα.
The critical region for a two-tailed test at level of significance α is given by:
Z > zα/2 or Z < -zα/2, i.e., |Z| > zα/2.   ... (16-66b)
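Rules (16-66) to (16-66b) amount to a simple decision rule. The sketch below is illustrative only; the function name and its arguments are hypothetical and not from the text.

    from scipy.stats import norm

    def in_critical_region(z, alpha, tail):
        """Return True if the computed Z falls in the rejection region."""
        if tail == "right":                       # (16-66):  Z > z_alpha
            return z > norm.ppf(1 - alpha)
        if tail == "left":                        # (16-66a): Z < -z_alpha
            return z < -norm.ppf(1 - alpha)
        return abs(z) > norm.ppf(1 - alpha / 2)   # (16-66b): |Z| > z_(alpha/2)

    print(in_critical_region(2.1, 0.05, "two"))   # True:  2.1 > 1.96
    print(in_critical_region(2.1, 0.01, "two"))   # False: 2.1 < 2.576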
16-6-10. P-Value or Probability Value of Test Statistic
P-Value. The probability that the value of the test statistic is at least as extreme as its computed value on the basis of the sample data under H₀, is called its P-value.
1. For the right-tailed test, the P-value is the area to the right of the computed value of the test statistic under H₀. [Figure 16-5(a)]
2. For the left-tailed test, the P-value is the area to the left of the computed value of the test statistic under H₀. [Figure 16-5(b)]
[Fig. 16-5(a): Right-Tailed Test, P-value = area to the right of the computed test statistic; Fig. 16-5(b): Left-Tailed Test, P-value = area to the left of the computed test statistic. For a two-tailed test, the P-value is twice the corresponding one-tail area.]
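In terms of the standard normal test statistic, these definitions translate into the following one-liners (an illustrative sketch, not from the text; the computed value Z = 2.1 is made up).

    from scipy.stats import norm

    z = 2.1                          # computed value of the test statistic (illustrative)

    p_right = norm.sf(z)             # right-tailed: area to the right of z
    p_left = norm.cdf(z)             # left-tailed:  area to the left of z
    p_two = 2 * norm.sf(abs(z))      # two-tailed:   twice the one-tail area
    print(round(p_right, 4), round(p_left, 4), round(p_two, 4))   # 0.0179 0.9821 0.0357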
Step 4: Identify the sample statistic to be used and its sampling distribution.
Step 5: Test statistic. Define and compute the test statistic under H₀.
Some of the commonly used distributions in obtaining the test statistic or test criterion are the normal, t, chi-square and F distributions, and tests based on them are discussed in Chapters 17 to 19.
Step 6: Obtain the critical value (or values) and the critical (rejection) region of the test statistic from the appropriate tables, given in the Appendix at the end of the book.
Step 7: If the computed value of the test statistic lies in the rejection region, we reject H₀ at level of significance 'α'.
If the computed value of the test statistic lies outside the rejection region, we fail to reject H₀.
It should be clearly understood that whenever we reject a null hypothesis, we have a certain degree of confidence in our decision, which depends on the level of significance used. Thus, at 'α' level of significance, the degree of confidence in our decision to reject H₀ is (1 - α), which is also called the confidence coefficient. However, the failure to reject H₀ is not the same thing as 'accepting H₀'. By this we simply mean that the sample data does not provide us sufficient evidence against H₀, which may, therefore, be regarded as true.
Step 8: Write the conclusion of the test in simple language.
Remark. If the calculated value of the test statistic is greater than the tabulated value (critical or significant value at level of significance 'α'), then we say that it is significant and the null hypothesis H₀ is rejected at level of significance 'α', i.e., with confidence coefficient (1 - α).
If the calculated value of the test statistic is less than the tabulated (critical) value, we say that it is not significant. By this we mean that the difference between the sample statistic and the corresponding parameter value under H₀ is just due to the fluctuations of sampling, and the sample data does not provide us sufficient evidence against the null hypothesis, which may, therefore, be accepted at 'α' level of significance.
Method 2: P-VALUE ESTIMATION METHOD
Steps 1 to 5 and Step 8 are the same as in Method 1, 'The Rejection Region Method'. There is a slight variation in Steps 6 and 7, as given below:
Step 6: Find the P-value of the computed test statistic under H₀ in Step 5. [For details, see § 16-6-10.]
Step 7: If P-value < α, we reject H₀ at 'α' level of significance.
If P-value > α, we fail to reject H₀ at 'α' level of significance.
Step 8: Write the conclusion of the test in simple language.
Remark. In Chapters 17 to 19, we will be using Method 1, i.e., the Rejection Region Method. In all these chapters the reader is advised to strictly follow the order of the steps explained above.
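To tie the two methods together, here is a compact illustrative run of the procedure for the battery-life illustration of the null hypothesis above (H₀: μ ≥ 48 months against H₁: μ < 48 months). The sample figures (n = 64, x̄ = 46.8 months, σ = 4 months) are invented purely for this sketch and are not from the text.

    from scipy.stats import norm

    # Hypothetical sample data for the battery-life illustration
    mu0, sigma, n, xbar, alpha = 48, 4, 64, 46.8, 0.05

    # Step 5: large-sample test statistic under H0 (left-tailed test)
    z = (xbar - mu0) / (sigma / n ** 0.5)

    # Method 1: rejection-region method, reject H0 if Z < -z_alpha
    z_alpha = norm.ppf(1 - alpha)
    reject_rr = z < -z_alpha

    # Method 2: P-value method for a left-tailed test
    p_value = norm.cdf(z)
    reject_p = p_value < alpha

    print(round(z, 2), round(z_alpha, 3), round(p_value, 4), reject_rr, reject_p)
    # The two methods necessarily lead to the same decision.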
EXERCISE 16-1
1. Discuss briefly the importance of estimation theory in decision making in the face of uncertainty.
2. Explain, with illustrations, the concept of (i) Point Estimation and (ii) Interval Estimation.
3. Describe the important properties of a good estimator.
4. Define the following terms and give one example for each:
(i) Unbiased Statistic, (ii) Consistent Statistic, (iii) Efficient Statistic and (iv) Sufficient Statistic.