INFERENTIAL STATISTICS & HYPOTHESIS TESTING
After this session you should be able to:
• Discuss the concept and meaning of inferential statistics
• Describe the inferential statistics procedures
• Describe the procedures for testing hypotheses
• Recognize the difference between Type I and Type II errors
• Describe the inferential statistical methods
INTRODUCTION
• What does it mean when the mean score of students in a class is 60%?
• What does it mean when 14% of people who smoke cigarettes have small cell
lung cancer?
• Descriptive statistics:
• Summarize the distribution of given variables in a given sample or population
• Provide statistics of location (measures of central tendency)
• Provide statistics of dispersion (variance, range and standard deviation)
• A researcher wanted to understand whether a significant difference exists in
COVID-19 vaccine uptake between male and female medical students. A sample of
100 males and 100 females was selected to participate in the study.
• Inferential statistics:
• Draw inferences about the population from a sample
• Involve estimation (i.e., guessing the characteristics of a population from a
sample of that population) and hypothesis testing (i.e., finding evidence for or
against an explanation or theory)
• Different statistical techniques that fall under inferential statistics can be
used, both parametric and non-parametric tests
• Parametric tests (assume the population data follow a specified
distribution, usually the normal distribution) include:
• t-test, ANOVA, Pearson correlation, binary logistic regression
• Non-parametric tests (distribution-free; no such assumption) include:
• Mann-Whitney U test, sign test, Wilcoxon signed-rank test,
Kruskal-Wallis test, chi-square test, Fisher's exact test
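The vaccine-uptake question from this introduction can be sketched as a chi-square test on a 2x2 table. This is only an illustration: the slides give the sample sizes (100 males, 100 females) but no results, so the counts below are made up, and the function name `chi_square_2x2` is hypothetical. The sketch uses the Python standard library only and applies no continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (chi2, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, chi2 is the square of a standard normal
    # variable, so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# HYPOTHETICAL counts (vaccinated, not vaccinated); not data from the study.
males_vaccinated, males_not = 70, 30
females_vaccinated, females_not = 55, 45

chi2, p = chi_square_2x2(males_vaccinated, males_not,
                         females_vaccinated, females_not)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
print("Reject H0 at the 5% level" if p < 0.05 else "Do not reject H0 at the 5% level")
```

With real data, the null hypothesis would be that uptake does not differ by sex, and a small p-value would count as evidence against it.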
Inferential statistics procedures: Parameter estimation
• Inferences are drawn from a sample that is representative of a population,
and these inferences can then be generalized to the whole population
• In making these inferences, the researcher produces an estimate that should
be close to the actual (true) population value.
• Point estimation:
• This is a type of estimation in which the estimate is a single value. For example,
the sample mean ‘x̄’ is used as the estimate of the population mean ‘µ’ and is
expected to be close to it.
• Point estimates include the sample mean and the sample proportion. If the
population mean is ‘µ’, the sample mean is ‘x̄’.
• In a similar manner, if the population proportion is ‘P’, then the sample proportion
is ‘p’.
• Interval estimation:
• An interval estimate is a range between two numbers within which the population
parameter could lie. Thus, for the population mean ‘µ’, the interval estimate is
a < µ < b.
• That is, the population mean is estimated to be greater than ‘a’ but less than ‘b’.
For example, an interval estimate could be 45 to 47, within which the population
mean is expected to lie.
• Because the interval is built with a stated level of confidence (commonly 95% or
99%), the researcher can say how confident he/she is that the interval contains
the population value.
• Interval estimates include the confidence interval for a mean and the confidence
interval for a proportion.
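Point and interval estimation as described above can be sketched in a few lines. The sketch assumes a normal approximation (a z-interval), which is reasonable for large samples; for small samples a t-based interval is usually preferred. The function names and the sample values are hypothetical and only illustrate the idea.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Point estimate (x-bar) and normal-approximation CI for the population mean."""
    n = len(sample)
    x_bar = mean(sample)                      # point estimate of mu
    se = stdev(sample) / math.sqrt(n)         # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return x_bar, (x_bar - z * se, x_bar + z * se)

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Point estimate (p) and normal-approximation CI for the population proportion."""
    p_hat = successes / n                     # point estimate of P
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return p_hat, (p_hat - z * se, p_hat + z * se)

# HYPOTHETICAL sample of ten measurements.
scores = [44, 47, 45, 48, 46, 45, 47, 46, 44, 48]
x_bar, (a, b) = mean_confidence_interval(scores, confidence=0.95)
print(f"point estimate = {x_bar}, 95% interval: {a:.2f} < mu < {b:.2f}")

# HYPOTHETICAL sample: 14 of 100 people with the outcome of interest.
p_hat, (lo, hi) = proportion_confidence_interval(14, 100, confidence=0.95)
print(f"point estimate = {p_hat}, 95% interval: {lo:.3f} < P < {hi:.3f}")
```

Raising the confidence level from 95% to 99% widens the interval: more confidence is bought with less precision.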
Inferential statistics procedures: Hypothesis testing
What is a hypothesis?
• A formal statement of the research question is often called a hypothesis
• The hypothesis should be stated in such a way that a “true” or “false”
answer from an experiment would support or refute it
• Hypotheses are used to state the relationship between two variables and may
be stated as:
1) Null hypothesis (assumption: there is no relationship between the variables)
e.g. There is no relationship between depression and HIV/AIDS infection
status
2) Alternative hypothesis (assumption: there is a relationship between the variables)
e.g. There is a difference in depression scale scores by HIV/AIDS status.
• Differences between the null hypothesis and the alternative hypothesis
• 1) The null hypothesis states equality; the alternative hypothesis states
inequality.
• 2) The alternative hypothesis refers to the sample, and the null hypothesis
refers to the population.
• 3) The null hypothesis is tested indirectly, and the alternative hypothesis is
tested directly. This is because we make inferences about the population
based on the sample.
• 4) Alternative hypotheses are usually written using Roman symbols, whereas
null hypotheses are written using Greek symbols. For example, µ (‘mu’) is the
symbol for the population mean (a parameter), whereas x̄ is the symbol for the
sample mean.
• 5) The alternative hypothesis is an explicit hypothesis, whereas the
null hypothesis is an implied hypothesis, mainly because it cannot be
directly tested
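To make the null and alternative hypotheses concrete, here is a minimal sketch of a test of "no difference in mean depression score between two groups". It uses a permutation test, chosen here only because it needs nothing beyond the Python standard library; it is not a method named in the slides. The scores and the function name `permutation_test` are hypothetical.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    H0: group membership is unrelated to the scores (no difference in means).
    H1: the group means differ.
    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random, as H0 allows
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the p-value is never exactly 0.
    return observed, (extreme + 1) / (n_permutations + 1)

# HYPOTHETICAL depression scale scores; not data from any study.
hiv_positive = [18, 22, 25, 20, 24, 27, 21, 23]
hiv_negative = [12, 15, 14, 17, 13, 16, 18, 11]

diff, p_value = permutation_test(hiv_positive, hiv_negative)
print(f"observed difference in means = {diff}, p = {p_value:.4f}")
```

A small p-value means a difference this large would rarely arise if the null hypothesis were true, so the null hypothesis is rejected in favour of the alternative.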
Characteristics of a good hypothesis
• 1. The hypothesis is stated in declarative form, not as a question.
• 2. It states the relationship that is expected between the given variables.
• 3. It reflects the theory or literature on which it is based.
• 4. The hypothesis needs to be clear, to the point and brief.
• 5. It must be possible to test the hypothesis.
Type I and type II errors
• Type I error
• Rejecting a true null hypothesis
• Type II error
• Failing to reject ("accepting") a false null hypothesis
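The two error types can be illustrated with a small simulation. The sketch below assumes a one-sample z-test with known standard deviation and made-up settings (sample size 30, a true mean of 0.5 when the null hypothesis is false); the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided one-sample z-test with known sigma.
    Returns True if H0: mu = mu0 is rejected at level alpha."""
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_mu, n=30, trials=2000, seed=1):
    """Fraction of simulated samples from N(true_mu, 1) in which H0: mu = 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, 1.0) for _ in range(n)]
        if z_test_rejects(sample):
            rejections += 1
    return rejections / trials

# H0 is true (mu really is 0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=0.0)
# H0 is false (mu is really 0.5): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=0.5)

print(f"estimated Type I error rate:  {type_1_rate:.3f}")
print(f"estimated Type II error rate: {type_2_rate:.3f}")
```

Under these assumptions the Type I error rate should come out close to the chosen significance level (0.05), while the Type II error rate depends on the sample size and on how far the true mean is from the hypothesised one.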
Inferential statistical methods
• Measurement of variables
• Continuous vs categorical variables
• Statistical methods
• Univariate: a single variable, descriptive; the method depends on the type of
variable:
• 1) Categorical variables: frequencies, tables, graphs
• 2) Continuous variables: measures of central tendency and dispersion
• Bivariate: relationship between two variables (t-test, ANOVA, chi-square,
binary logistic regression, correlation, Fisher's exact test)
• Multivariate analysis (multiple correlation, regression (multivariate,
multinomial), factor analysis, MANOVA, etc.)
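As a rough illustration of univariate and bivariate analysis, the sketch below builds a frequency table for a categorical variable, summarises a continuous variable, and computes a Pearson correlation between two continuous variables. All data and variable names are hypothetical, and `pearson_r` is a hand-written helper, not a library function.

```python
import math
from collections import Counter
from statistics import mean, median, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)

# HYPOTHETICAL data for eight students.
sex = ["M", "F", "F", "M", "F", "M", "F", "F"]
hours_studied = [2, 4, 5, 3, 6, 1, 7, 4]
exam_score = [52, 60, 66, 55, 71, 48, 78, 62]

# Univariate, categorical: frequency table.
freq = Counter(sex)
print("frequencies:", dict(freq))

# Univariate, continuous: central tendency and dispersion.
summary = {
    "mean": mean(hours_studied),
    "median": median(hours_studied),
    "range": max(hours_studied) - min(hours_studied),
    "sd": stdev(hours_studied),
}
print("hours studied:", summary)

# Bivariate, continuous vs continuous: correlation.
r = pearson_r(hours_studied, exam_score)
print(f"Pearson r (hours vs score) = {r:.3f}")
```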
Group exercise (work with your neighbour)
Which tests can be used to find an association between:
• Categorical vs continuous: ?
• Categorical vs categorical: ?
• Continuous vs continuous: ?
• Thank you