Mb0050 SLM Unit11

The document discusses chi-square analysis and its applications. It begins by introducing chi-square tests and their uses in testing goodness of fit and the independence and equality of variables. It then provides examples of chi-square tests for goodness of fit, including testing if customer preferences for ice cream flavors match suppliers' claims, and testing if car accidents are equally likely on each day of the week. The document explains how to calculate chi-square statistics and compare them to critical values to evaluate null hypotheses.

Uploaded by

Margabandhu Narasimhan


Research Methodology

Unit 11

Chi-Square Analysis

Structure
11.1 Introduction
Objectives
11.2 A Chi-square Test for the Goodness of Fit
11.3 A Chi-square Test for the Independence of Variables
11.4 A Chi-square Test for the Equality of More than
Two Population Proportions
11.5 Case Study
11.6 Summary
11.7 Glossary
11.8 Terminal Questions
11.9 Answers
11.10 References

11.1 Introduction
In the last unit, we discussed the Z test for the equality of two population proportions.
If we have more than two populations and want to test the equality of all their
proportions simultaneously, the Z test cannot be used, because it can compare only
two proportions at a time. In such a situation, the chi-square test comes to our
rescue and carries out the test in one go.
The chi-square test is widely used in research. To use it, data is required in
the form of frequencies. Data expressed as percentages or proportions can also be
used, provided it can be converted into frequencies. The majority of the applications
of chi-square ($\chi^2$) are with discrete data. The test can also be applied to
continuous data, provided the data is reduced to categories and tabulated in such a
way that the chi-square test may be applied.
Some of the important properties of the chi-square distribution are:
• Unlike the normal and t distributions, the chi-square distribution is not
symmetric.
• The values of a chi-square are greater than or equal to zero.
• The shape of a chi-square distribution depends upon the degrees of
freedom. As the degrees of freedom increase, the distribution tends
to the normal distribution.
Sikkim Manipal University

There are many applications of a chi-square test. The following are discussed
in this unit:
• A chi-square test for the goodness of fit
• A chi-square test for the independence of variables
• A chi-square test for the equality of more than two population proportions
Objectives
After studying this unit, you should be able to:
discuss various applications of chi-square tests like:
o a chi-square test for the goodness of fit
o a chi-square test for the independence of variables
o a chi-square test for the equality of more than two population
proportions

11.2 A Chi-square Test for the Goodness of Fit

As discussed before, the data in chi-square tests is usually in terms of counts or
frequencies. The actual survey data may be on a nominal or higher scale of
measurement. If it is on a higher scale of measurement, it can always be
converted into categories. Real-world situations in business allow for the
collection of count data on variables such as gender, marital status, job
classification, age and income. Therefore, the chi-square test is a much
sought-after tool for analysis.
The researcher has to decide which statistical test is implied by the chi-square
statistic in a particular situation. The principles common to all the chi-square
tests are summarized in the following steps:
• State the null and the alternative hypotheses about a population.
• Specify a level of significance.
• Compute the expected frequencies of the occurrence of certain events
under the assumption that the null hypothesis is true.
• Note the observed counts of the data points falling in the different cells.
• Compute the chi-square value given by the formula:

$$\chi^2_{k-1} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where,
$O_i$ = Observed frequency of the ith cell
$E_i$ = Expected frequency of the ith cell
k = Total number of cells
k - 1 = degrees of freedom
• Compare the sample value of the statistic obtained in the previous step
with the critical value at the given level of significance and make the decision.
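
The steps above can be sketched in Python. This is a minimal illustration rather than part of the original unit: the function names `chi_square_statistic` and `decide`, the sample counts, and the hard-coded critical value (7.815, the tabulated 5 per cent value for 3 degrees of freedom) are assumptions chosen for the example.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all cells of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def decide(sample_value, critical_value):
    """Compare the sample statistic with the tabulated critical value."""
    return "reject H0" if sample_value > critical_value else "do not reject H0"


# Hypothetical counts for k = 4 cells (illustrative only, not from the unit).
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]          # expected counts under H0
degrees_of_freedom = len(observed) - 1  # k - 1 = 3
critical_value = 7.815               # tabulated chi-square, 3 df, 5 per cent level

sample_value = chi_square_statistic(observed, expected)
print(round(sample_value, 3), decide(sample_value, critical_value))
```

The critical value is passed in explicitly because it comes from a chi-square table, exactly as in the hand computations of this unit.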
A goodness of fit test is a statistical test of how well the observed data
supports an assumption about the distribution of a population. In other words,
the test examines how well an assumed distribution fits the data. Researchers
often assume that the sample is drawn from a normal or some other
distribution of interest. A test of how well the normal or any other distribution
fits the given data may therefore be of interest.
Consider, for example, the case of the multinomial experiment, which is
an extension of the binomial experiment. In the multinomial experiment, the
number of categories k is greater than 2. Further, a data point can fall into
one of the k categories, and the probability of the data point falling in the ith
category is a constant denoted by pi, where i = 1, 2, 3, 4, ..., k. In summary,
a multinomial experiment has the following features:
• There is a fixed number of trials.
• The trials are statistically independent.
• Every possible outcome of a trial is classified into exactly one of the several
categories.
• The probabilities for the different categories remain constant for each
trial.
Consider as an example that a respondent can fall into any one of
four non-overlapping income categories. Let the probabilities that the respondent
falls into each of the four groups be denoted by the four parameters p1,
p2, p3 and p4. The multinomial distribution with these parameters,
and with n the number of people in a random sample, specifies the probability of
any combination of the cell counts.
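
As an illustration of the last statement, the sketch below computes the multinomial probability of one particular combination of cell counts. The function name `multinomial_probability`, the four income-group probabilities and the sample of n = 4 respondents are hypothetical values chosen for the example; they do not come from the unit.

```python
from math import factorial


def multinomial_probability(counts, probabilities):
    """P(X1 = x1, ..., Xk = xk) for n = sum(counts) independent trials."""
    if len(counts) != len(probabilities):
        raise ValueError("counts and probabilities must have the same length")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Multinomial coefficient n! / (x1! x2! ... xk!), kept as an exact integer.
    coefficient = factorial(sum(counts))
    for x in counts:
        coefficient //= factorial(x)
    result = float(coefficient)
    for x, p in zip(counts, probabilities):
        result *= p ** x
    return result


# Hypothetical parameters p1..p4 for four income categories.
p = [0.4, 0.3, 0.2, 0.1]
# Probability that a sample of 4 respondents splits as 2, 1, 1, 0.
probability = multinomial_probability([2, 1, 1, 0], p)
print(round(probability, 4))
```

With k = 2 categories the same function reduces to the binomial probability, which is the sense in which the multinomial experiment extends the binomial one.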
Given such a situation, we may use the multinomial distribution to test how
well the data fits the assumption of k probabilities p1, p2, ..., pk of falling into the k
cells. The hypothesis to be tested is:
H0 : Probabilities of the occurrence of events E1, E2, ..., Ek are given by
the specified probabilities p1, p2, ..., pk
H1 : Probabilities of the k events are not the pi stated in the null hypothesis.
Such hypotheses can be tested using the chi-square statistic. A set of
illustrative examples is given below.
Example 11.1: The manager of ABC ice-cream parlour has to take a decision
regarding how much of each flavour of ice-cream he should stock so that the
demands of the customers are satisfied. The ice-cream suppliers claim that
among the four most popular flavours, 62 per cent of customers prefer vanilla, 18
per cent chocolate, 12 per cent strawberry and 8 per cent mango. A random
sample of 200 customers produces the results given below. At the α = 0.05
significance level, test the claim that the percentages given by the suppliers are
correct.

Flavour              Vanilla   Chocolate   Strawberry   Mango
Number preferring    120       40          18           22

Solution:
Let
pv : proportion of customers preferring vanilla flavour.
pc : proportion of customers preferring chocolate flavour.
ps : proportion of customers preferring strawberry flavour.
pm : proportion of customers preferring mango flavour.
H0 : pv = 0.62, pc = 0.18, ps = 0.12, pm = 0.08
H1 : The proportions are not those specified in the null hypothesis.
The expected frequencies corresponding to the various flavours under the
assumption that the null hypothesis is true are:
Vanilla = 200 × 0.62 = 124
Chocolate = 200 × 0.18 = 36
Strawberry = 200 × 0.12 = 24
Mango = 200 × 0.08 = 16
$$\chi^2_3 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

The computations for $\chi^2_3$ are as under:

Flavour       O (Observed     E (Expected     O - E   (O - E)²   (O - E)²/E
              Frequencies)    Frequencies)
Vanilla       120             124             -4      16         0.129
Chocolate     40              36              4       16         0.444
Strawberry    18              24              -6      36         1.500
Mango         22              16              6       36         2.250
Total                                                            4.323

The computed value of chi-square is 4.323.

With k = 4 categories there are k - 1 = 3 degrees of freedom.
Table $\chi^2_3$ (5 per cent) = 7.815 (see Appendix 3 at the end of the book.)

[Figure: Rejection region for Example 11.1]

As the sample $\chi^2$ (4.323) is less than the critical value, it lies in the
acceptance region and H0 is not rejected. The data are therefore consistent with
the customer preference rates stated by the suppliers.
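
The arithmetic of Example 11.1 can be checked with a short script. This is a sketch added for verification, not the unit's own method; the variable names are illustrative, and the critical value 7.815 is the standard tabulated 5 per cent value for 3 degrees of freedom, entered by hand.

```python
# Hypothesized proportions (suppliers' claim) and observed counts from Example 11.1.
proportions = {"Vanilla": 0.62, "Chocolate": 0.18, "Strawberry": 0.12, "Mango": 0.08}
observed = {"Vanilla": 120, "Chocolate": 40, "Strawberry": 18, "Mango": 22}

n = sum(observed.values())  # total sample size
expected = {flavour: n * p for flavour, p in proportions.items()}

# Contribution of each cell to the statistic: (O - E)^2 / E.
contributions = {
    flavour: (observed[flavour] - expected[flavour]) ** 2 / expected[flavour]
    for flavour in observed
}
chi_square = sum(contributions.values())
df = len(observed) - 1      # k - 1
critical_value = 7.815      # tabulated chi-square, 3 df, 5 per cent level

for flavour, value in contributions.items():
    print(f"{flavour:<11}{expected[flavour]:>8.1f}{value:>8.3f}")
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {chi_square > critical_value}")
```

Printing the per-cell contributions makes it easy to see that the mango and strawberry cells account for most of the statistic, even though the total stays below the critical value.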
It may be worth pointing out that for the application of a chi-square test,
the expected frequency in each cell should be at least 5. Further, the sample
observations should be taken independently and at random. If one or more cells
have an expected frequency of less than 5, one can still
carry out the chi-square analysis by combining them into meaningful cells so
that the combined expected frequency is at least 5. Another point worth mentioning
is that the degrees of freedom, usually denoted by df, are given by
k - 1, where k denotes the number of cells (categories) after any such combining.
It may be noted that in Example 11.1, the hypothesized probabilities were
not equal. There are situations where the hypothesized probabilities in each
category are equal; in other words, the interest is in investigating the uniformity
of the distribution. The following example illustrates this.
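
The rule about combining cells with small expected frequencies can be sketched as follows. The helper `merge_small_cells` and the counts are hypothetical, and the sketch assumes that the categories are ordered so that pooling neighbouring cells is meaningful; whether a particular combination is meaningful remains the researcher's judgement.

```python
def merge_small_cells(observed, expected, minimum=5.0):
    """Pool adjacent cells until every pooled expected count is at least `minimum`."""
    merged_obs, merged_exp = [], []
    running_obs, running_exp = 0, 0.0
    for o, e in zip(observed, expected):
        running_obs += o
        running_exp += e
        if running_exp >= minimum:
            merged_obs.append(running_obs)
            merged_exp.append(running_exp)
            running_obs, running_exp = 0, 0.0
    # A leftover tail that is still too small is folded into the last pooled cell.
    if running_obs > 0 or running_exp > 0:
        if merged_exp:
            merged_obs[-1] += running_obs
            merged_exp[-1] += running_exp
        else:
            merged_obs.append(running_obs)
            merged_exp.append(running_exp)
    return merged_obs, merged_exp


# Hypothetical counts: the last two cells have expected frequencies below 5.
observed = [12, 9, 5, 3, 1]
expected = [11.0, 9.5, 5.5, 2.5, 1.5]
pooled_observed, pooled_expected = merge_small_cells(observed, expected)
df = len(pooled_observed) - 1  # degrees of freedom use the number of cells after pooling
print(pooled_observed, pooled_expected, df)
```

Pooling leaves the totals of the observed and expected frequencies unchanged; only the number of cells, and hence the degrees of freedom, decreases.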

Example 11.2: An insurance company provides auto insurance and is analysing
the data obtained from fatal crashes. A sample of motor vehicle deaths is
randomly selected for a two-year period. The number of fatalities is listed below
for the different days of the week. At the 0.05 significance level, test the claim
that accidents occur on different days with equal frequency.

Day                    Monday   Tuesday   Wednesday   Thursday   Friday   Saturday   Sunday
Number of fatalities   31       20        20          22         22       29         36

Solution:
Let
p1 = Proportion of fatalities on Monday
p2 = Proportion of fatalities on Tuesday
p3 = Proportion of fatalities on Wednesday
p4 = Proportion of fatalities on Thursday
p5 = Proportion of fatalities on Friday
p6 = Proportion of fatalities on Saturday
p7 = Proportion of fatalities on Sunday
H0 : p1 = p2 = p3 = p4 = p5 = p6 = p7 = 1/7
H1 : At least one of these proportions differs from 1/7.

n = Total frequency = 31 + 20 + 20 + 22 + 22 + 29 + 36 = 180
The expected number of fatalities on each day of the week under the
assumption that the null hypothesis is true is given as under:
Monday = 180 × 1/7 = 25.714
Tuesday = 180 × 1/7 = 25.714
Wednesday = 180 × 1/7 = 25.714
Thursday = 180 × 1/7 = 25.714
Friday = 180 × 1/7 = 25.714
Saturday = 180 × 1/7 = 25.714
Sunday = 180 × 1/7 = 25.714
The computation of the sample chi-square value is given in the following table:

Day          Observed          Expected          O - E     (O - E)²   (O - E)²/E
             Frequencies (O)   Frequencies (E)
Monday       31                25.714            5.286     27.942     1.087
Tuesday      20                25.714            -5.714    32.650     1.270
Wednesday    20                25.714            -5.714    32.650     1.270
Thursday     22                25.714            -3.714    13.794     0.536
Friday       22                25.714            -3.714    13.794     0.536
Saturday     29                25.714            3.286     10.798     0.420
Sunday       36                25.714            10.286    105.802    4.114
Total                                                                 9.233

The value of sample $\chi^2 = \sum \frac{(O - E)^2}{E} = 9.233$
Degrees of freedom = 7 - 1 = 6
Critical (Table) $\chi^2_6$ = 12.592

Since the sample chi-square value is less than the tabulated $\chi^2$, there is
not enough evidence to reject the null hypothesis, as shown in the figure below.

[Figure: Rejection region for Example 11.2]
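
When all k hypothesized probabilities are equal, every expected frequency is n/k and the statistic simplifies algebraically. The short derivation below is an added cross-check, not part of the original solution; it uses only the fact that the observed frequencies sum to n.

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - n/k)^2}{n/k} = \frac{k}{n}\sum_{i=1}^{k} O_i^2 - 2\sum_{i=1}^{k} O_i + n = \frac{k}{n}\sum_{i=1}^{k} O_i^2 - n$$

For the fatalities data, $\sum O_i^2 = 31^2 + 20^2 + 20^2 + 22^2 + 22^2 + 29^2 + 36^2 = 4866$, so

$$\chi^2 = \frac{7}{180} \times 4866 - 180 = 189.233 - 180 = 9.233,$$

which agrees with the value obtained from the table.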

Self-Assessment Questions
1. For the application of a chi-square test, the expected frequency in each
cell should be at least five. (True/False)
2. The sample value of the chi-square can be negative. (True/False)
3. If there are k categories of data, the degree of freedom would be _______.

11.3 A Chi-square Test for Independence of Variables

The chi-square test can be used to test the independence of two variables each
having at least two categories. The test makes use of contingency tables, also
referred to as cross-tabs with the cells corresponding to a cross classification of
attributes or events. A contingency table with 3 rows and 4 columns (as an
example) is shown in Table 11.1.
Table 11.1 Contingency Table with 3 Rows and 4 Columns

Second Classification      First Classification Category
Category                   1      2      3      4      Total
1                          O11    O12    O13    O14    R1
2                          O21    O22    O23    O24    R2
3                          O31    O32    O33    O34    R3
Total                      C1     C2     C3     C4     n

Assuming that there are r rows and c columns, the count in the cell
corresponding to the ith row and the jth column is denoted by Oij, where i = 1, 2,
..., r and j = 1, 2, ..., c. The total for row i is denoted by Ri whereas that
corresponding to column j is denoted by Cj. The total sample size is given by n,
which is also the sum of all the r row totals or the sum of all the c column totals.
The hypothesis test for independence is:
H0 : Row and column variables are independent of each other.
H1 : Row and column variables are not independent.
The hypothesis is tested using a chi-square test statistic for independence
given by:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

The degrees of freedom for the chi-square statistic are given by (r - 1)(c - 1).
For a given level of significance α, the sample value of the chi-square is
compared with the critical value for (r - 1)(c - 1) degrees of freedom to make
a decision.
The expected frequency in the cell corresponding to the ith row and the jth
column is given by:
$$E_{ij} = \frac{R_i \times C_j}{n}$$

Where,
$R_i$ = Total for the ith row
$C_j$ = Total for the jth column
n = Total sample size.
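
The expected-frequency formula can be sketched in a few lines of Python. The function name `expected_frequencies` and the 2 × 3 table of counts are hypothetical and serve only to illustrate the calculation of row total × column total / n for every cell.

```python
def expected_frequencies(table):
    """Return E_ij = R_i * C_j / n for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts (illustrative only).
observed = [
    [10, 20, 30],
    [20, 20, 20],
]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The row and column totals of the expected table equal those of the observed table, which is a useful check on hand computations.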

Let us consider a few examples:

Example 11.3: A sample of 870 trainees was subjected to different types of
training classified as intensive, good and average and their performance was
noted as above average, average and poor. The resulting data is presented in
the table below. Use a 5 per cent level of significance to examine whether there
is any relationship between the type of training and performance.
                            Training
Performance       Intensive   Good   Average   Total
Above average     100         150    40        290
Average           100         100    100       300
Poor              50          80     150       280
Total             250         330    290       870

Solution:
H0 : Attribute performance and the training are independent.
H1 : Attribute performance and the training are not independent.
The expected frequencies corresponding to the ith row and the jth column in
the contingency table are denoted by Eij, where i = 1, 2, 3 and j = 1, 2, 3.
E1,1 = (290 × 250) / 870 = 83.33
E1,2 = (290 × 330) / 870 = 110.00
E1,3 = (290 × 290) / 870 = 96.67
E2,1 = (300 × 250) / 870 = 86.21
E2,2 = (300 × 330) / 870 = 113.79
E2,3 = (300 × 290) / 870 = 100.00
E3,1 = (280 × 250) / 870 = 80.46
E3,2 = (280 × 330) / 870 = 106.21
E3,3 = (280 × 290) / 870 = 93.33
The table of the observed and expected frequencies corresponding to
the ith row and the jth column and the computation of the chi-square are given in
the table below.

Row, Column   Oij    Eij       (Oij - Eij)²   (Oij - Eij)²/Eij
1,1           100    83.33     277.89         3.335
1,2           150    110.00    1600.00        14.545
1,3           40     96.67     3211.49        33.221
2,1           100    86.21     190.16         2.21
2,2           100    113.79    190.16         1.671
2,3           100    100.00    0              0.000
3,1           50     80.46     927.81         11.53
3,2           80     106.21    686.96         6.468
3,3           150    93.33     3211.49        34.41
Total                                         107.39

Sample $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 107.39$

The critical value of the chi-square at 5 per cent level of significance with
(3 - 1)(3 - 1) = 4 degrees of freedom is 9.49. The sample value of the chi-square
falls in the rejection region, as shown in the figure below.

[Figure: Rejection region for Example 11.3]

Therefore, the null hypothesis is rejected and one can conclude that there
is an association between the type of training and performance.
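
The whole test of independence for Example 11.3 can be reproduced with the sketch below. The function name `chi_square_independence` is illustrative, and the critical value 9.488 is the tabulated 5 per cent value for 4 degrees of freedom, typed in by hand. Because the script does not round the expected frequencies, its statistic can differ from the hand-computed 107.39 in the second decimal place.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n  # E_ij = R_i * C_j / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: above average, average, poor. Columns: intensive, good, average training.
performance_by_training = [
    [100, 150, 40],
    [100, 100, 100],
    [50, 80, 150],
]
statistic, df = chi_square_independence(performance_by_training)
critical_value = 9.488  # tabulated chi-square, 4 df, 5 per cent level
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {statistic > critical_value}")
```

The same function can be applied to any r × c table of counts, provided the expected frequency in each cell is at least 5.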
Example 11.4: The following table gives the number of good and defective
parts produced by each of the three shifts in a factory:

Shift       Good    Defective   Total
Day         900     130         1030
Evening     700     170         870
Night       400     200         600
Total       2000    500         2500

Is there any association between the shift and the quality of the parts
produced? Use a 0.05 level of significance.
Solution:
H0 : There is no association between the shift and the quality of parts
produced.
H1 : There is an association between the shift and quality of parts.
The computations of the expected frequencies corresponding to the ith
row and the jth column of the contingency table (i = 1, 2, 3 and j = 1, 2) are
shown below:
E1,1 = (1,030 × 2,000) / 2,500 = 824
E1,2 = (1,030 × 500) / 2,500 = 206
E2,1 = (870 × 2,000) / 2,500 = 696
E2,2 = (870 × 500) / 2,500 = 174
E3,1 = (600 × 2,000) / 2,500 = 480
E3,2 = (600 × 500) / 2,500 = 120

The table of the observed and expected frequencies corresponding to
the ith row and the jth column and the computation of the chi-square is given
below:

Row, Column   Oij    Eij    (Oij - Eij)²   (Oij - Eij)²/Eij
1,1           900    824    5776           7.010
1,2           130    206    5776           28.039
2,1           700    696    16             0.023
2,2           170    174    16             0.092
3,1           400    480    6400           13.333
3,2           200    120    6400           53.333
Total                                      101.83

The sample chi-square is $\chi^2 = \sum_{i=1}^{3} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 101.83$

The critical value of the chi-square with (3 - 1)(2 - 1) = 2 degrees of freedom
at 5 per cent level of significance is 5.991. The null hypothesis is rejected as
the sample chi-square lies in the rejection region, as shown in the figure below.
Therefore, the quality of parts produced is related to the shifts in which they
were produced.

[Figure: Rejection region for Example 11.4]

It may be worth mentioning again that for the application of a chi-square
test of independence, the sample should be selected at random and the expected
frequency in each cell should be at least 5.
Activity 1
Conduct a survey of 300 households and note down their religion and
food habits (vegetarian or non-vegetarian). Cross-tabulate this data and
examine statistically the hypothesis that food habits are independent of
religion.

Self-Assessment Questions
4. In a cross table, where chi-square test is applied the null hypothesis is
that the two variables are related. (True/False)
5. The expected frequencies in a cross table are computed under the
assumption that null hypothesis is true. (True/False)
6. If any cell has a zero frequency, the chi-square cannot be applied. (True/
False)
7. The sum of each row and each column for the observed and expected
frequencies need not be equal. (True/False)

11.4 A Chi-square Test for the Equality of More than Two Population Proportions

In certain situations, researchers may be interested in testing whether the
proportion of a particular characteristic is the same in several populations. The
interest may lie in finding out whether the proportion of people liking a movie is
the same for the three age groups: 25 and under, over 25 and under 50, and
50 and over. To take another example, the interest may be in determining whether,
in an organization, the proportion of satisfied employees in four categories
(class I, class II, class III and class IV employees) is the same. In a sense, the
question of whether the proportions are equal is a question of whether the
populations (the three age groups, or the four classes of employees) are
homogeneous with respect to the characteristic being studied. Therefore, the tests
for equality of proportions across several populations are also called tests of
homogeneity.
The analysis is carried out in exactly the same way as for the
other two cases. The formula for the chi-square statistic remains the same.
However, two important assumptions are different here.
(i) We identify our populations (e.g., age groups or various classes of employees)
and draw the sample directly from these populations.
(ii) As we identify the populations of interest and sample from them directly,
the sizes of the samples from the different populations of interest are fixed.
This is also called a chi-square analysis with fixed marginal totals. The
hypothesis to be tested is as under:
H0 : The proportion of people satisfying a particular characteristic is the
same in all populations.
H1 : The proportion of people satisfying a particular characteristic is not
the same in all populations.
The expected frequency for each cell could also be obtained by using the
formula as explained earlier. There is an alternative way of computing the same,
which would give identical results. This is shown in the following example:
Example 11.5: An accountant wants to test the hypothesis that the proportion
of incorrect transactions at four client accounts is about the same. A random
sample of 80 transactions of one client reveals that 21 are incorrect; for the
second client, the number is 25 out of 100; for the third client, the number is 30
out of 90 sampled and for the fourth, 40 are incorrect out of a sample of 110.
Conduct the test at α = 0.05.
Solution:
Let
p1 = Proportion of incorrect transactions for the 1st client
p2 = Proportion of incorrect transactions for the 2nd client
p3 = Proportion of incorrect transactions for the 3rd client
p4 = Proportion of incorrect transactions for the 4th client

H0 : p1 = p2 = p3 = p4
H1 : Not all the proportions are the same.
The observed data in the problem can be rewritten as:

Transactions              Client 1   Client 2   Client 3   Client 4   Total
Incorrect transactions    21         25         30         40         116
Correct transactions      59         75         60         70         264
Total                     80         100        90         110        380

An estimate of the combined proportion of incorrect transactions under
the assumption that the null hypothesis is true:

$$\bar{p} = \frac{21 + 25 + 30 + 40}{80 + 100 + 90 + 110} = \frac{116}{380} = 0.305$$

$\bar{q}$ = combined proportion of correct transactions
= 1 - $\bar{p}$ = 1 - 0.305 = 0.695
Using the above, the expected frequencies corresponding to the various
cells are computed as shown below:

Transactions              Client 1            Client 2             Client 3             Client 4              Total
Incorrect transactions    80 × 0.305 = 24.4   100 × 0.305 = 30.5   90 × 0.305 = 27.45   110 × 0.305 = 33.55   115.9
Correct transactions      80 × 0.695 = 55.6   100 × 0.695 = 69.5   90 × 0.695 = 62.55   110 × 0.695 = 76.45   264.1
Total                     80                  100                  90                   110                   380

In fact, the sum of each row/column in both the observed and expected
frequency tables should be the same. Here, a slight discrepancy is found because
of rounding error. It can be easily verified that the expected frequencies
in each cell would be the same using the formula $E_{ij} = \frac{R_i \times C_j}{n}$
already explained. Now the value of the chi-square statistic can be calculated as:
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{4} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

$$= \frac{(21 - 24.4)^2}{24.4} + \frac{(25 - 30.5)^2}{30.5} + \frac{(30 - 27.45)^2}{27.45} + \frac{(40 - 33.55)^2}{33.55}$$

$$+ \frac{(59 - 55.6)^2}{55.6} + \frac{(75 - 69.5)^2}{69.5} + \frac{(60 - 62.55)^2}{62.55} + \frac{(70 - 76.45)^2}{76.45}$$

= 0.474 + 0.992 + 0.237 + 1.240 + 0.208 + 0.435 + 0.104 + 0.544
= 4.234
Degrees of freedom (df) = (2 - 1)(4 - 1) = 3
The critical value of the chi-square with 3 degrees of freedom at 5 per
cent level of significance equals 7.815. Since the sample value of $\chi^2$ is less than
the critical value, there is not enough evidence to reject the null hypothesis.
Therefore, there is no significant
difference in the proportion of incorrect transactions for the four clients.
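
The pooled-proportion route used in Example 11.5 can be sketched as follows. The variable names are illustrative and the critical value 7.815 is entered from the table. The script keeps the unrounded pooled proportion 116/380, so its statistic is close to, but not exactly, the 4.234 obtained above with the proportion rounded to 0.305.

```python
# Data from Example 11.5: incorrect transactions and sample sizes for four clients.
incorrect = [21, 25, 30, 40]
sample_sizes = [80, 100, 90, 110]

# Combined (pooled) proportions under H0: p1 = p2 = p3 = p4.
pooled_p = sum(incorrect) / sum(sample_sizes)  # 116 / 380
pooled_q = 1 - pooled_p

chi_square = 0.0
for x, n_i in zip(incorrect, sample_sizes):
    expected_incorrect = n_i * pooled_p  # same as (row total * column total) / n
    expected_correct = n_i * pooled_q
    chi_square += (x - expected_incorrect) ** 2 / expected_incorrect
    chi_square += ((n_i - x) - expected_correct) ** 2 / expected_correct

df = (2 - 1) * (len(sample_sizes) - 1)  # (rows - 1)(columns - 1)
critical_value = 7.815                  # tabulated chi-square, 3 df, 5 per cent level
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {chi_square > critical_value}")
```

Multiplying each sample size by the pooled proportion is algebraically the same as the row total × column total / n formula, since the pooled proportion is the row total divided by n.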
Activity 2
Go to an MBA college where the students are admitted from engineering,
commerce, science and other backgrounds. Take a sample of 200 students
and examine whether they are uniformly distributed over the four
above-mentioned categories.

Self-Assessment Questions
8. If there are 3 rows and 2 columns, the degrees of freedom for the chi-square test are ________.
9. The combined estimate of proportion is obtained under the assumption
that __________ is true.
10. To test the equality of three population proportions, the alternative
hypothesis is written as H1 : p1 = p2 = p3. (True/false)

11.5 Case Study

Preference for Fast Food
Mahesh Enterprises (ME) has a chain of high-class restaurants in Punjab
and Haryana serving high-quality multicuisine food at premium prices. The
restaurants serve only lunch and dinner. The top management of the
restaurants observes that the total sales revenues of the restaurants have
been more or less stagnant, growing at a rate of only 2 per cent for the last
three years. A meeting of the senior management personnel was called to
discuss the issue. Some of them were of the opinion that young customers
in the age group of 18 to 35 were switching to fast food. Further, they were of
the view that the trend is mainly among people belonging to the high-income
group and to families where both partners were economically employed.
In the series of meetings held by the top management, it was decided to
launch a chain of fast food joints in the states where they were already present.
However, before starting the fast food joints, they got a survey conducted to
understand the preference of people for fast food. A sample of 100
respondents was chosen.
Data was collected on preference for fast food on an interval scale, where
the respondents were asked to rate their preference for fast food on a 5-point
scale, where 1 = not at all preferred, 2 = not preferred, 3 = neutral,
4 = preferred, and 5 = very much preferred. Further, the variable preference
was redefined as 'not preferred' for those having a score of 1 to 3, and 'preferred'
for those having a score of 4 or 5. The actual age of the respondents was
taken and divided into two categories. Those less than or equal to 40 years of
age were treated as younger respondents, whereas those above 40 were
treated as older respondents. There were three income
categories: low income (households with monthly income less than Rs 25,000),
middle income (households with monthly income of Rs 25,000 or more but
less than Rs 50,000) and high income (households with monthly income more
than Rs 50,000). The data on gender of the respondents was also taken. A
cross-tabulation was carried out of preference for fast food with age,
gender and income. The results of the cross-tables are reported below in
Table 1 to Table 3.
Table 1 Cross-tabulation of Preference for Fast Food with Age
Age

Total

Younger Respondents

Older Respondents

Not preferred

Count

Preferred

Count

Total

Count

100


Table 2 Cross-table of Preference for Fast Food with Gender

Gender

Total

Male

Female

Not preferred

Count

Preferred

Count

Total

Count

100

Table 3 Cross-tabulation of Preference for Fast Food with Income

Income
Total

Low Income

Middle
Income

High Income

Not preferred

Count

Preferred

Count

Total

Count

100

Discussion Questions
1. Using the data given in Tables 1 to 3, examine the hypothesis that
preference for fast food is related to (i) age, (ii) gender, and (iii) income.
You may use a 5 per cent level of significance.
2. Write a summary of the findings.
[Hint: To examine the hypothesis for the relationship for preference for
fast food with the age, the following hypothesis would be tested.
H0 : Preference for fast food is independent of age
H1 : Preference for fast food is related to age
The expected frequencies would be obtained as in Section 11.3. Using
this, the value of chi-square can be computed and the hypothesis be tested.
Similarly the other two cases can be handled.]

11.6 Summary
Let us recapitulate the important concepts discussed in this unit:
• The chi-square test has a variety of applications in research. Chi-square is a
non-symmetrical distribution taking non-negative values.
• It can be used to test the goodness of fit of a distribution, independence
of variables and equality of more than two population proportions.
• A necessary condition for the application of the chi-square test is that the
expected frequency in each cell should be at least 5.
• The first and foremost thing for the application of chi-square is the
computation of expected frequencies.
• The data in a chi-square test is in terms of counts or frequencies. In case
the actual data is on a scale higher than nominal or ordinal, it can
always be converted into categories.

11.7 Glossary
• Degrees of freedom: These are given by (r - 1)(c - 1) for a contingency
table with r rows and c columns.
• Chi-square distribution: This is a non-symmetric distribution taking only
non-negative values.
• Non-symmetric distribution: A distribution that is skewed towards
one of its tails.

11.8 Terminal Questions

1. What is a $\chi^2$ test? Point out its applications. Under what conditions is this
test applicable?
2. What is the chi-square test of goodness of fit? What precautions are
necessary while applying this test? Point out its role in business decision-making.
3. A cigarette company interested in the relation between sex of a person
and the type of cigarettes smoked has collected the following data from a
random sample of 150 persons:
Cigarette

Male

Female

Total

150

Test whether the type of cigarette smoked and the sex are independent.
4. A survey was carried out in a state among the doctors belonging to the
rural health service cadre (500 doctors) and among the medical education
directorate cadre (300 teaching doctors). They were asked the question,
'Would it be acceptable to you if the government proposes to hire all the
doctors on a fixed-period contractual basis?' The doctors were to answer
either 'Acceptable' or 'Not Acceptable'. There was no third category,
'Undecided'. The following data was compiled in a cross-tabulated
format:

Doctors           Acceptable   Not Acceptable   Total
Rural Cadre       195          305              500
Teaching Cadre    140          160              300
Total             335          465              800

Test an appropriate hypothesis using a 5 per cent level of significance.

5. The following figures show the distribution of the digits in numbers chosen
at random from a telephone directory:

Digit        0       1       2     3     4       5     6       7     8     9     Total
Frequency    1,026   1,107   997   966   1,075   933   1,107   972   964   853   10,000

Test whether the digits may be taken to occur equally frequently in the directory.

11.9 Answers
Answers to Self-Assessment Questions
1. True
2. False
3. K-1
4. False
5. True
6. True
7. False
8. 2
9. Null hypothesis
10. False

Answers to Terminal Questions

1. Chi-square is a test which is very widely used as it does not require very
strict assumptions for its applicability. Refer to Section 11.1 for further
details.
2. It tells us whether the given data is taken from a particular distribution.
Refer to Section 11.2 for further details.
3. This is the test for independence of variables. Refer to Section 11.3 for
further details.
4. This is the test for independence of variables. Refer to Section 11.3 for
further details.
5. This is a test of goodness of fit to a uniform distribution, since a single
sample is classified into ten categories. Refer to Section 11.2 for further details.

11.10 References
• Chawla, D. and Sondhi, N. (2011). Research Methodology: Concepts and
Cases. New Delhi: Vikas Publishing House.
• Kothari, C. R. (1990). Research Methodology: Methods and Techniques.
New Delhi: Wiley Eastern.
• Zikmund, William G. (2000). Business Research Methods. Fort Worth:
Dryden Press.

Sikkim Manipal University

Page No. 277

