INFERENTIAL STATISTICS
TOPICS
• Sampling Methods and Distribution
• Sampling Error
⚬ Central Limit Theorem
⚬ Confidence Intervals
INFERENTIAL STATISTICS
Inferential statistics allow us to make inferences about
the broader population from which a random sample is drawn.
The ability to make inferences about a population based on a
sample is a primary goal of statistics, and this chapter will focus
on the application of inferential statistics.
INFERENTIAL STATISTICS
Inferential statistics involves the use of a sample (1) to estimate some
characteristic in a large population; and (2) to test a research hypothesis
about a given population. To appropriately estimate a
population characteristic, or parameter, a random and unbiased sample must
be drawn from the population of interest. The terms population and sample
refer to different things in statistics and are described using different
terminology. For instance, a population is defined as the full collection of
individuals from which a sample (i.e., a subcollection of individuals) is drawn.
A population is described by parameters, or the measured characteristics that
describe a population.
SAMPLING METHOD

TWO TYPES OF SAMPLING METHOD
• Probability Sampling - involves random selection,
allowing you to make strong statistical inferences about
the whole group
• Non-Probability Sampling - involves non-random
selection based on convenience or other criteria,
allowing you to easily collect data.
SAMPLING METHOD

PROBABILITY SAMPLING
• Simple Random Sampling
• Systematic Sampling
• Stratified Random Sampling
• Clustered Sampling

NON-PROBABILITY SAMPLING
• Convenience Sampling
• Quota Sampling
• Purposive Sampling
• Snowball Sampling
PROBABILITY SAMPLING
METHOD
Simple Random Sampling (SRS): a reliable method of obtaining
information in which every member of a population is chosen purely by chance,
so each person has the same probability of being included in the sample.
An SRS of size n individuals from the population is chosen
in such a way that every set of n individuals has an equal chance to be the
sample actually selected. A specific advantage of simple random sampling is
that it is the most straightforward method of probability sampling. A
disadvantage is that you may not find enough individuals with your
characteristic of interest, especially if that characteristic is uncommon.
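For illustration, a minimal sketch of drawing a simple random sample with Python's standard library; the population of 100 hypothetical IDs and the sample size of 10 are invented for the example:

```python
import random

# Hypothetical population of 100 student IDs (invented for this example)
population = list(range(1, 101))

# Simple random sample of size n = 10:
# every set of 10 IDs has the same chance of being the sample selected.
srs = random.sample(population, k=10)
print(srs)
```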
PROBABILITY SAMPLING
METHOD
Systematic Sampling: a method where sample members of a
population are chosen at regular intervals. It requires selecting a random
starting point, after which members are selected at a fixed, regular interval
until the desired sample size is reached. Systematic sampling is often more
convenient than simple random sampling because it is easy to administer.
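A minimal sketch of systematic sampling under the same made-up frame of 100 units, with a desired sample of 10: pick a random start within the first interval, then take every k-th unit.

```python
import random

population = list(range(1, 101))  # hypothetical sampling frame of 100 units
n = 10                            # desired sample size
k = len(population) // n          # sampling interval (here k = 10)

start = random.randint(0, k - 1)  # random starting point in the first interval
systematic_sample = population[start::k]
print(systematic_sample)          # every k-th unit from the starting point
```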
PROBABILITY SAMPLING
METHOD
Stratified Random Sampling: a method that divides the target
population into smaller groups that do not overlap but represent the
entire population. The samples selected from the different strata are
combined to form a single sample.
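A rough sketch of proportional stratified sampling; the strata names and sizes are invented for the example:

```python
import random

# Hypothetical non-overlapping strata (names and sizes invented for this example)
strata = {
    "freshman":  list(range(0, 400)),
    "sophomore": list(range(400, 700)),
    "junior":    list(range(700, 900)),
    "senior":    list(range(900, 1000)),
}

total = sum(len(units) for units in strata.values())
n = 100  # overall sample size

# Proportional allocation: draw a simple random sample within each stratum,
# then combine the per-stratum samples into a single sample.
sample = []
for units in strata.values():
    n_h = round(n * len(units) / total)
    sample.extend(random.sample(units, n_h))

print(len(sample))  # 100
```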
PROBABILITY SAMPLING
METHOD
Cluster Sampling: a method where statisticians divide the entire population into
clusters or sections representing a population. Demographic characteristics, such as
race/ethnicity, gender, age, and zip code can be used to identify a cluster. Cluster
sampling can be more efficient than simple random sampling, especially where a
study takes place over a wide geographic region.
An extended version of cluster sampling is Multi-stage Sampling, where, in the first
stage, the population is divided into clusters, and clusters are selected. At each
subsequent stage, the selected clusters are further divided into smaller clusters. The
process continues until the last stage, where some members of each
cluster are selected for the sample. Multi-stage sampling involves a combination of
cluster and stratified sampling.
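A small sketch contrasting one-stage cluster sampling with a two-stage (multi-stage) variant; the cluster labels and sizes are made up for the example:

```python
import random

# Hypothetical clusters (e.g., zip codes), each containing 50 individuals
clusters = {f"zip_{i}": [f"zip_{i}_person_{j}" for j in range(50)]
            for i in range(20)}

# Stage 1: randomly select whole clusters
chosen = random.sample(list(clusters), k=4)

# One-stage cluster sampling: keep every unit in the selected clusters
cluster_sample = [person for c in chosen for person in clusters[c]]

# Two-stage (multi-stage) variant: subsample units within each selected cluster
two_stage_sample = [person for c in chosen
                    for person in random.sample(clusters[c], k=10)]

print(len(cluster_sample), len(two_stage_sample))  # 200 40
```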
NON-PROBABILITY
SAMPLING METHOD
In non-probability sampling, non-randomized methods are used.
Participants are chosen because they are easy to access, rather than
through randomization. The limitation of this method is that the
results are not generalizable to the population but are relevant
primarily to the particular group sampled.
NON-PROBABILITY
SAMPLING METHOD
• Convenience Sampling: the easiest method of sampling, because
participants are selected based on availability and willingness to take
part.
• Quota Sampling: interviewers are given a quota of subjects of a
specified type to attempt to recruit (see the sketch after this list).
• Purposive Sampling: also known as selective or subjective sampling.
This technique relies on the judgement of the researcher when
choosing who to ask to participate.
• Snowball Sampling: this method is commonly used in the social
sciences when investigating hard-to-reach groups. Existing subjects are
asked to nominate further subjects known to them, so the sample
grows like a rolling snowball.
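As a rough illustration of quota sampling from the list above, a minimal Python sketch; the participant pool, the gender attribute, and the quota sizes are all made up for the example, and real quota sampling relies on interviewer judgement rather than a script:

```python
import random

# Hypothetical pool of potential participants with a recorded gender attribute
pool = [{"id": i, "gender": random.choice(["F", "M"])} for i in range(500)]

# Quotas the interviewers must fill (made-up numbers)
quotas = {"F": 25, "M": 25}
recruited = {"F": [], "M": []}

# Non-random in spirit: take whoever comes along, in order, until quotas are met
for person in pool:
    g = person["gender"]
    if len(recruited[g]) < quotas[g]:
        recruited[g].append(person)
    if all(len(recruited[k]) >= quotas[k] for k in quotas):
        break

print({g: len(members) for g, members in recruited.items()})
```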
SAMPLING
DISTRIBUTION OF
SAMPLE MEANS
Given the presence of sampling error, one may wonder if it is ever
possible to generalize from a sample to a larger population. The
theoretical model known as the sampling distribution of means has
certain properties that give it an important role in the sampling
process. Levin and Fox 2006[8] point out these characteristics:
SAMPLING
DISTRIBUTION OF
SAMPLE MEANS
1. The sampling distribution of means approximates a normal curve.
The sampling distribution of means is the probability distribution of
a sample statistic that is formed when random samples of size n
are repeatedly taken from a population (Larson & Farber 2019)[9].
If the raw data are normally distributed, then the distribution of
sample means is normal regardless of sample size. Every sample
statistic has a sampling distribution.
SAMPLING
DISTRIBUTION OF
SAMPLE MEANS
2. The mean of a sampling distribution of means (the mean of means) is
equal to the true population mean. They are regarded as
interchangeable values.
SAMPLING
DISTRIBUTION OF
SAMPLE MEANS
3. The standard deviation of a sampling distribution of means is smaller
than the standard deviation of the population.
SAMPLING
DISTRIBUTION OF
SAMPLE MEANS

Example:
Consider the population consisting of the values (2, 4, 6).
a. List all the samples of size 2 drawn with replacement.
b. Compute the mean of each sample.
c. Identify the probability of each sample mean.
d. Compute the mean of the sampling distribution of the mean.
e. Compute the population mean.
f. Compare the population mean with the mean of the sampling
distribution of means.
SAMPLING DISTRIBUTION

x̄ (sample mean)    P(x̄) = f/n    x̄ · P(x̄)
2                   1/9            2/9
3                   2/9            6/9
4                   3/9            12/9
5                   2/9            10/9
6                   1/9            6/9

n = 9 (total number of samples)

M = Σ(f · x̄) / n = Σ x̄ · P(x̄) = 36/9 = 4

wherein,
M = mean of the sampling distribution
f = frequency of each sample mean
x̄ = sample mean
n = total number of samples
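The worked example above can be verified by brute force; a short Python sketch that enumerates all nine samples of size 2 drawn with replacement from the population (2, 4, 6):

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import mean

population = [2, 4, 6]

# All samples of size 2 drawn with replacement (3 x 3 = 9 samples)
samples = list(product(population, repeat=2))
sample_means = [mean(s) for s in samples]

# Probability of each sample mean: 2, 3, 4, 5, 6 occur with
# probabilities 1/9, 2/9, 3/9, 2/9, 1/9
probabilities = {m: Fraction(c, len(samples))
                 for m, c in Counter(sample_means).items()}

# Mean of the sampling distribution equals the population mean
M = sum(m * p for m, p in probabilities.items())
print(M, mean(population))  # both equal 4
```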
CENTRAL LIMIT THEOREM

What is the Central Limit Theorem?
The central limit theorem says that the sampling
distribution of the mean will be approximately normally
distributed, provided the sample size is large
enough. This holds regardless of whether the population has a
normal, Poisson, binomial, or any other distribution.
Consider that there are 15 sections in the science
department of a university, and each section hosts
around 100 students. Our task is to calculate the
average weight of students in the science
department.
Steps in calculating the average
• First, measure the weights of all the students in the science
department.
• Add all the weights.
• Finally, divide the total sum of weights by the total number of
students to get the average.
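These steps amount to computing a population mean; a tiny sketch with made-up weights for a handful of students (the real calculation would cover all of the roughly 1,500 students in the department):

```python
# Hypothetical weights in kilograms for a few students (invented; the real
# calculation would cover all ~1,500 students in the department)
weights = [58.2, 64.5, 71.3, 55.0, 68.7, 60.1]

total = sum(weights)            # add all the weights
average = total / len(weights)  # divide by the number of students
print(round(average, 2))
```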
Sample population
Central Limit Theorem Formula
If random samples of size n are drawn from a population with mean μ and
standard deviation σ, the sample mean x̄ is approximately normally
distributed with mean μ and standard deviation (standard error) σ/√n.
Distribution of the Variable in the
Population
Normal: also known as the Gaussian distribution. It is symmetric
about the mean, showing that data near the mean are more frequent
in occurrence than data far from the mean.
Right-Skewed: also known as positively skewed. The longer tail of the
distribution extends to the right (positive) side of the peak.
Left-Skewed: also known as negatively skewed. The longer tail of the
distribution extends to the left of the peak.
Uniform: the data are spread evenly across the range, so the
distribution has no pronounced peak.
This part of the definition refers to the distribution of the variable’s
values in the population from which you draw a random sample.
Conditions of the Central Limit Theorem
• The sample size is sufficiently large. This
condition is usually met if the size of the sample
is n ≥ 30.
• The sampled observations are independent and identically
distributed (i.i.d.) random variables; the
sampling should be random.
• The population’s distribution has a finite
variance. The central limit theorem doesn’t apply
to distributions with infinite variance.
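A small simulation can illustrate these conditions in practice; this sketch assumes a made-up, strongly right-skewed population and arbitrary sizes (n = 40 per sample, 5,000 repeated samples), and uses NumPy only for convenience:

```python
import numpy as np

rng = np.random.default_rng(0)

# A strongly right-skewed population (exponential); parameters are arbitrary
population = rng.exponential(scale=2.0, size=100_000)

n = 40               # sample size (>= 30, meeting the usual rule of thumb)
num_samples = 5_000  # number of repeated random samples

# Draw repeated random samples and record each sample mean
sample_means = np.array([rng.choice(population, size=n).mean()
                         for _ in range(num_samples)])

# The sample means should be roughly normal, centred on the population mean,
# with spread close to sigma / sqrt(n)
print(population.mean(), sample_means.mean())
print(population.std() / np.sqrt(n), sample_means.std())
```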
Significance of the Central
Limit Theorem
Practical Applications of CLT
Assumptions Behind the
Central Limit Theorem
• The data must follow the randomization
condition: it must be sampled randomly.
• Samples should be independent of each
other; one sample should not influence the other
samples.
• The sample size should not be more than 10%
of the population when sampling is done
without replacement.
Assumptions Behind the
Central Limit Theorem
• The sample size should be sufficiently large.
Now, how will we figure out how large this size
should be? Well, it depends on the population.
When the population is skewed or asymmetric, the
sample size should be large. If the population is
symmetric, then we can draw small samples as
well.
Confidence
Intervals
Introduction
Confidence intervals (CIs) are a fundamental concept in
inferential statistics, providing a range of values within which a
population parameter is expected to lie. They account for sampling
variability, acknowledging that point estimates may not reflect the
true population values accurately. A CI is defined by an upper and
lower bound, calculated using the sample data and a specified
confidence level (e.g., 95%). This level indicates the probability
that the interval contains the true parameter if the sampling were
repeated multiple times.
Definition and
importance of
confidence intervals
A confidence interval is a statistical range that estimates where a
population parameter (like a mean) is likely to fall, based on sample data.
Typically expressed with a confidence level (e.g., 95% or 99%), it
quantifies uncertainty and helps assess the reliability of estimates.
Importance:
• Statistical Inference: aids in making conclusions about a population
from samples.
• Decision-Making: informs choices in research, business, and healthcare
by indicating the precision of estimates.
• Significance Assessment: helps determine if results are statistically
significant.
Calculation Method
Interpretation
Construct Confidence
Intervals
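As a rough sketch of the calculation and construction steps just named, the following Python example computes a 95% confidence interval for a mean; the sample values are made up, and the normal (z) critical value 1.96 is used for simplicity, whereas a small sample would normally call for the t distribution:

```python
import math
import statistics

# Hypothetical sample data (invented for this example), n = 30
sample = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9, 12.4, 12.0,
          11.7, 12.6, 12.1, 11.9, 12.2, 12.3, 11.8, 12.0, 12.4, 12.1,
          12.2, 11.9, 12.0, 12.5, 11.8, 12.3, 12.1, 12.0, 11.9, 12.2]

n = len(sample)
x_bar = statistics.mean(sample)   # point estimate of the population mean
s = statistics.stdev(sample)      # sample standard deviation
se = s / math.sqrt(n)             # standard error of the mean

z = 1.96                          # critical value for a 95% confidence level
lower, upper = x_bar - z * se, x_bar + z * se
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```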
Thank
You!!!