Hypothesis Testing in Biostatistics

This document discusses various statistical tests used to analyze data, including z-tests, t-tests, and chi-square tests. It explains how to formulate hypotheses, calculate test statistics, determine decision rules, and make conclusions about population parameters based on sample data for these tests. Examples are provided to demonstrate how to apply these statistical tests to compare means and test for differences between groups. References for further reading on basic statistical concepts and methodology are also included.

Uploaded by

Eugine Balomaga

B.Sc. (Hons.) Biotechnology
Core Course 13:
Basics of Bioinformatics and
Biostatistics (BIOT 3013)
Unit 3:
Test of significance: Z-test,
t-test and Chi-square test
Dr. Satarudra Prakash Singh
Department of Biotechnology
Mahatma Gandhi Central University,
Motihari
Test of significance
• It is used in estimating population parameters
using sample data.
• For example, the administrator of a large
hospital is interested in knowing the mean
age of patients admitted during the last year.
• The administrator draws a random sample of
size n from the patient population and
computes the sample mean x̄, which is used as a
point estimate of µ.
Test of significance
• Because random sampling involves chance,
x̄ cannot be expected to equal µ exactly.
• The value of x̄ may be greater than or less
than µ.
• Statistical inference (test of significance) is the
method by which we reach a
conclusion about a population on the basis of
the information in a sample drawn from that
population.
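The point that x̄ varies from sample to sample can be illustrated with a small simulation. This is only a sketch: the population of ages below is simulated, and its mean (about 45), spread, and the sample size of 25 are assumptions chosen for illustration, not values from the course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated patient ages.
population = [random.gauss(45, 12) for _ in range(10_000)]
mu = statistics.fmean(population)

# Draw three random samples of size n = 25 and compute each sample mean.
sample_means = [
    statistics.fmean(random.sample(population, 25)) for _ in range(3)
]

print(f"population mean mu = {mu:.2f}")
for i, xbar in enumerate(sample_means, start=1):
    print(f"sample {i}: x-bar = {xbar:.2f}, x-bar - mu = {xbar - mu:+.2f}")
```

Each sample mean lands near µ but not on it, which is why a formal test is needed before drawing a conclusion about µ.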
Hypothesis testing
about population parameters using sample statistics

• A hypothesis is a statement about one or more
populations.
• For example, a hospital administrator may
want to test the hypothesis that the average
length of stay of patients admitted to the
hospital is 5 days.
Hypothesis testing
• There are two hypotheses involved in
hypothesis testing
1. Null hypothesis H0: It is the hypothesis to be
tested.
2. Alternative hypothesis HA: It is a statement
of what we believe is true if our sample data
provide reason to reject the null hypothesis.
Hypothesis testing steps about the
mean of a population
1. Data collection: identify the variable, the
sample size (n), the sample mean (x̄), and the
population standard deviation (σ), or the sample
standard deviation (s) if σ is unknown.
2. Assumptions: there are two cases:
• Case 1: Population is normally distributed with
known or unknown variance (sample size n may
be small or large).
• Case 2: Population is not normal, with known or
unknown variance (n is large, i.e. n ≥ 30).
3. Hypotheses: there are three cases to test
(in the examples below, μ0 = 50).
• Case I: we want to test that the population
mean is different from 50.
H0: μ = μ0
HA: μ ≠ μ0
• Case II: we want to test that the population
mean is greater than 50.
H0: μ = μ0
HA: μ > μ0
• Case III: we want to test that the population mean is
less than 50.
H0: μ = μ0
HA: μ < μ0
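4. Test statistic:
• Case 1: Population is normally distributed.
i) σ² is known (n large or small): $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
ii) σ² is unknown, n large: $Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
iii) σ² is unknown, n small: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
• Case 2: Population is not normal and n is large.
i) If σ² is known: $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
ii) If σ² is unknown: $Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
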
Text Book: Basic Concepts and Methodology for the Health Sciences
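All of the test statistics in step 4 share one form, (x̄ - µ0) divided by the standard error. A minimal Python sketch of that calculation follows; the function name `standardized_statistic` and the numbers in the usage line are hypothetical and chosen only for illustration.

```python
import math


def standardized_statistic(sample_mean, mu0, sd, n):
    """Return (sample_mean - mu0) / (sd / sqrt(n)).

    Pass sd = sigma when the population variance is known (Z statistic).
    Pass sd = s when it is estimated from the sample: the result is then
    treated as Z for large n, or as t with n - 1 degrees of freedom for
    a small sample from a normal population.
    """
    if n <= 0 or sd <= 0:
        raise ValueError("n and sd must be positive")
    return (sample_mean - mu0) / (sd / math.sqrt(n))


# Hypothetical numbers: x-bar = 52, mu0 = 50, sigma = 8, n = 64.
z = standardized_statistic(sample_mean=52.0, mu0=50.0, sd=8.0, n=64)
print(z)  # 2.0, since the standard error is 8 / sqrt(64) = 1
```

Which reference distribution (Z or t) the value is compared against depends on the case in step 2, not on the arithmetic.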
5. Decision rule on the basis of the level of significance (α)
i) If HA: μ ≠ μ0
Reject H0 if $Z > Z_{1-\alpha/2}$ or $Z < -Z_{1-\alpha/2}$ (when using the Z-test)
or reject H0 if $T > t_{1-\alpha/2,\,n-1}$ or $T < -t_{1-\alpha/2,\,n-1}$ (when using the t-test)
ii) If HA: μ > μ0
Reject H0 if $Z > Z_{1-\alpha}$ (when using the Z-test)
or reject H0 if $T > t_{1-\alpha,\,n-1}$ (when using the t-test)
iii) If HA: μ < μ0
Reject H0 if $Z < -Z_{1-\alpha}$ (when using the Z-test)
or reject H0 if $T < -t_{1-\alpha,\,n-1}$ (when using the t-test)
6. Decision: If we reject H0, we can conclude that HA is
true. If we do not reject H0, the data do not provide
sufficient evidence for HA.
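The Z-test branch of the decision rule can be sketched in Python with the standard library's `statistics.NormalDist`, which supplies the normal quantiles that table A1 tabulates. The function names and the `alternative` labels are hypothetical. The t-test branch is not covered here because the standard library has no t quantile function, so those critical values would still come from a table.

```python
from statistics import NormalDist


def z_critical(alpha, two_sided):
    """Return Z_{1-alpha/2} (two-sided) or Z_{1-alpha} (one-sided)."""
    q = 1 - alpha / 2 if two_sided else 1 - alpha
    return NormalDist().inv_cdf(q)


def reject_h0(z, alpha, alternative):
    """Apply the Z-test decision rule.

    alternative is 'two-sided' (HA: mu != mu0), 'greater' (HA: mu > mu0)
    or 'less' (HA: mu < mu0).
    """
    if alternative == "two-sided":
        return abs(z) > z_critical(alpha, two_sided=True)
    if alternative == "greater":
        return z > z_critical(alpha, two_sided=False)
    if alternative == "less":
        return z < -z_critical(alpha, two_sided=False)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


print(round(z_critical(0.05, two_sided=True), 2))   # 1.96
print(reject_h0(1.7, 0.05, "two-sided"))            # False
print(reject_h0(1.7, 0.05, "greater"))              # True
```

The last two lines show why the alternative must be fixed before looking at the data: the same Z of 1.7 is not significant two-sided but is significant one-sided at α = 0.05.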
7. An alternative decision rule can be applied
using the p-value.
i) If the p-value is less than or equal to α, we
reject the null hypothesis (p ≤ α).
ii) If the p-value is greater than α, we do not
reject the null hypothesis (p > α).
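For a Z statistic, the p-value used in this rule can be computed from the standard normal distribution. The sketch below assumes the usual definitions (two-sided: both tails beyond |Z|; one-sided: the tail in the direction of HA). The function name `z_p_value` is hypothetical.

```python
from statistics import NormalDist


def z_p_value(z, alternative):
    """Return the p-value of a Z statistic under the standard normal."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":   # HA: mu != mu0
        return 2 * (1 - cdf(abs(z)))
    if alternative == "greater":     # HA: mu > mu0
        return 1 - cdf(z)
    if alternative == "less":        # HA: mu < mu0
        return cdf(z)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


alpha = 0.05
p = z_p_value(2.5, "two-sided")
print(f"p = {p:.4f}; reject H0: {p <= alpha}")
```

The p-value rule and the critical-value rule always agree for the same α, so either can be used.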
Example 1
• Suppose a researcher is interested in the mean
age of a certain population. A random sample
of 10 individuals drawn from the target
population has a mean age of 27 years.
Assuming the population is normally
distributed with a variance of 20, can we
conclude that the mean is different from 30
years (α = 0.05)? If the p-value is 0.0340, how
can we use it in making a decision?
Solution
1. Data: variable is age, n = 10,
x̄ = 27, σ² = 20, α = 0.05.
2. Hypotheses: H0: μ = 30; HA: μ ≠ 30.
3. Test statistic:
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{27 - 30}{\sqrt{20}/\sqrt{10}} = -2.12$
4. Decision rule: The alternative hypothesis is
HA: μ ≠ 30; hence we reject H0
if $Z > Z_{1-0.05/2} = Z_{0.975}$
or $Z < -Z_{1-0.05/2} = -Z_{0.975}$.
• Z0.975 = 1.96 (from standard table A1).
5. Decision: Since Z = -2.12 < -1.96, we reject H0 and
conclude that the mean is different from 30.
Using the p-value: 0.0340 ≤ 0.05, so we again reject H0.
Can we conclude that μ < 30?
Decision rule: Reject H0 if Z < Zα, where
Zα = -1.645 (from table A1 at α = 5%).
Decision: Since -2.12 < -1.645, we reject H0. Thus, we can
conclude that the population mean is smaller than 30.
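The arithmetic of Example 1 can be checked with a few lines of Python. This is a sketch using only the numbers given in the example. The variable names are arbitrary, and the quantiles come from `statistics.NormalDist` rather than table A1.

```python
import math
from statistics import NormalDist

# Data from Example 1.
n, xbar, sigma2, mu0, alpha = 10, 27, 20, 30, 0.05

# Z = (x-bar - mu0) / (sigma / sqrt(n)); sigma is known here.
z = (xbar - mu0) / math.sqrt(sigma2 / n)

std_normal = NormalDist()
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)  # Z_{0.975}
z_one_sided = std_normal.inv_cdf(1 - alpha)      # Z_{0.95}

# Two-sided p-value for HA: mu != 30.
p_two_sided = 2 * std_normal.cdf(-abs(z))

reject_two_sided = abs(z) > z_two_sided  # HA: mu != 30
reject_less = z < -z_one_sided           # HA: mu < 30

print(f"Z = {z:.2f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
print(f"reject H0 (mu != 30): {reject_two_sided}")
print(f"reject H0 (mu < 30):  {reject_less}")
```

The script reproduces Z ≈ -2.12 and a two-sided p-value of about 0.034. The small difference from the quoted 0.0340 arises because the table lookup uses Z rounded to two decimals.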
Example 2
• Among 157 African-American men, the
sample mean systolic blood pressure was 146
mm Hg with a standard deviation of 27.
Assuming the population distribution is not
normal, can we conclude that the mean
systolic blood pressure for the population of
African-American men is greater than 140 mm Hg
at α = 0.01?
Solution
1. Data: Variable is systolic blood pressure,
n = 157, sample mean = 146, s = 27, α = 0.01.

2. Hypotheses: H0: μ = 140; HA: μ > 140

3. Test statistic:
$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{146 - 140}{27/\sqrt{157}} = \frac{6}{2.1548} = 2.78$
4. Decision rule:
We reject H0 if $Z > Z_{1-\alpha} = Z_{0.99} = 2.33$
(from table A1).

5. Decision: Since Z = 2.78 > 2.33, we reject H0. Hence,
we can conclude that the
mean systolic blood pressure for the population
of African-American men is greater than 140 mm
Hg.
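Example 2 can be verified the same way. The sketch below uses only the numbers stated in the example and relies on the large sample size (n = 157) to justify the normal approximation with s in place of σ, as in the slides. The variable names are arbitrary.

```python
import math
from statistics import NormalDist

# Data from Example 2.
n, xbar, s, mu0, alpha = 157, 146, 27, 140, 0.01

standard_error = s / math.sqrt(n)
z = (xbar - mu0) / standard_error

z_crit = NormalDist().inv_cdf(1 - alpha)  # Z_{0.99}
p_value = 1 - NormalDist().cdf(z)         # one-sided, HA: mu > 140
reject = z > z_crit

print(f"SE = {standard_error:.4f}, Z = {z:.2f}, Z_0.99 = {z_crit:.2f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

Both rules lead to the same decision: Z exceeds the critical value and the one-sided p-value is below α = 0.01.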

Student's t-test
• It is used to test the null hypothesis that there
is no difference between two means.
• There are three cases:
i) The one-sample t-test
To test if a sample mean
(as an estimate of a population mean) differs
significantly from a given population mean.
The formula for the one-sample t-test is: t = (x̄ - µ)/SE
where x̄ = sample mean, µ = given population mean
and SE = standard error of the mean.
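A minimal sketch of the one-sample t statistic follows, taking SE = s/√n with s the sample standard deviation, consistent with the test statistic given earlier for a small sample. The function name and the five lengths of stay are hypothetical illustration data, not values from the course.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: mu = mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)   # sample SD (divides by n - 1)
    se = s / math.sqrt(n)          # standard error of the mean
    return (xbar - mu0) / se, n - 1


# Hypothetical lengths of stay in days; H0: mu = 5.
stays = [4, 5, 6, 7, 8]
t, df = one_sample_t(stays, mu0=5)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic is then compared with the tabulated t value for n - 1 degrees of freedom. For example, the two-sided 5% critical value for 4 degrees of freedom is 2.776, so these hypothetical data would not lead to rejecting H0.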
ii) The unpaired t-test
To test if the population means estimated by
two independent samples differ significantly.
The formula for the unpaired t-test is: t = (x̄1 - x̄2)/SE
where x̄1 - x̄2 is the difference between the
means of the two groups and SE denotes the
standard error of the difference.
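The slides do not state how SE is computed for the unpaired test. The sketch below assumes the common pooled-variance form, which is appropriate when the two populations have equal variances, with n1 + n2 - 2 degrees of freedom. That choice, the function name, and the two small groups are assumptions made for illustration.

```python
import math
import statistics


def unpaired_t(group1, group2):
    """Pooled-variance two-sample t statistic (assumes equal variances)."""
    n1, n2 = len(group1), len(group2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    v1 = statistics.variance(group1)  # sample variances (n - 1 divisor)
    v2 = statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the difference
    diff = statistics.fmean(group1) - statistics.fmean(group2)
    return diff / se, df


# Hypothetical measurements for two independent groups.
treated = [4, 5, 6, 7, 8]
control = [1, 2, 3, 4, 5]
t, df = unpaired_t(treated, control)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

If the equal-variance assumption is doubtful, a different standard error (for example Welch's) would be used instead, and the degrees of freedom change accordingly.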
iii) The paired t-test
To test if the population means estimated by
two dependent samples differ significantly.
Usually, it is used when measurements are
made on the same subjects before and after a
treatment.
The formula for the paired t-test is: t = d̄/SE
where d̄ is the mean difference and SE
denotes the standard error of this difference.
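The paired test reduces to a one-sample test on the within-subject differences. The sketch below takes SE = s_d/√n, where s_d is the standard deviation of the differences. The function name and the before/after blood pressure readings are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, degrees of freedom) for H0: mean difference = 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]  # after minus before
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    d_bar = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return d_bar / se, n - 1


# Hypothetical systolic pressures (mm Hg) for the same five subjects.
before = [140, 150, 145, 160, 155]
after = [138, 146, 142, 155, 154]
t, df = paired_t(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Because each subject serves as their own control, between-subject variation drops out of SE, which is why the paired design is preferred for before-and-after measurements.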
Chi-square test
It is used to analyze categorical data.

It compares frequencies and tests whether the
observed data differ significantly from the data
expected if there were no differences between groups (H0).

It is calculated as the sum, over all categories, of the
squared difference between the observed (O) and the
expected (E) frequencies divided by the expected (E) frequency:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
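The sum described above translates directly into code. In this sketch the function name and the observed counts are hypothetical, and the expected counts assume 100 subjects spread evenly over four categories under H0.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 100 subjects in four categories.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]  # equal frequencies under H0
chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))  # 12.32 = (49 + 9 + 25 + 225) / 25
```

Most of the total comes from the last category, where the observed count is furthest from its expected value.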
The Decision Rule
• The quantity χ-square will be small if the
observed and expected frequencies are close
together and will be large if the differences
are large.
• The computed value of χ-square is compared
with the tabulated value with degrees of
freedom = (r-1)(c-1) where r is the number of
rows and c is the number of columns.
• Reject H0, if χ-square is greater than or equal
to the tabulated χ-square for the chosen value
of α.
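For an r × c table, the decision rule can be sketched as follows. The expected count in each cell is taken as (row total × column total) / grand total, the standard choice under a null hypothesis of no association, although the slides do not spell it out. The function name and the 2 × 2 counts are hypothetical, and the critical value 3.841 is the tabulated χ² for 1 degree of freedom at α = 0.05.

```python
def chi_square_table(table):
    """Return (chi-square, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return chi2, df


# Hypothetical 2 x 2 table: rows = groups, columns = outcome yes/no.
table = [[30, 20],
         [20, 30]]
chi2, df = chi_square_table(table)

critical_value = 3.841  # tabulated chi-square, df = 1, alpha = 0.05
print(f"chi-square = {chi2:.2f}, df = {df}")
print(f"reject H0: {chi2 >= critical_value}")
```

Here every expected count is 25, so χ² = 4 × (5² / 25) = 4.0, which exceeds 3.841 and leads to rejecting H0 at the 5% level.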
References
• Daniel WW, Cross CL. Biostatistics: Basic Concepts and
Methodology for the Health Sciences, 10ed,
ISV. ISBN: 9788126551897. 954 pages.
• Ali Z, Bhaskar SB. Basic statistical tools in
research and data analysis. Indian J Anaesth.
2016 Sep;60(9):662-669. doi: 10.4103/0019-5049.190623.
Erratum in: Indian J Anaesth.
2016 Oct;60(10):790. PMID: 27729694;
PMCID: PMC5037948.
Thank you.
