Sample Notes
o “Scientific research is systematic, controlled, empirical, and critical investigation of natural phenomena
guided by theory and hypotheses about the presumed relations among such phenomena.”
–Kerlinger, 1986
o Instrumentation, sampling
o Data analysis
Population
[Figure: target population, study population, and sample]
o A population can be defined as including all people or items with the characteristic one wishes to understand.
o Because there is very rarely enough time or money to gather information from everyone or everything in a
population, the goal becomes finding a representative sample (or subset) of that population.
o Note also that the population from which the sample is drawn may not be the same as the population about
which we actually want information. Often there is large but not complete overlap between these two groups,
due to sampling-frame issues and similar causes.
o Sometimes they may be entirely separate: for instance, we might study rats in order to get a better
understanding of human health, or we might study records from people born in 2008 in order to make predictions
about people born in 2009.
Sampling frame
o In the most straightforward case, such as the sentencing of a batch of material from production (acceptance
sampling by lots), it is possible to identify and measure every single item in the population and to include any
one of them in our sample. However, in the more general case this is not possible. There is no way to identify
all rats in the set of all rats. Where voting is not compulsory, there is no way to identify which people will
actually vote at a forthcoming election (in advance of the election).
o As a remedy, we seek a sampling frame which has the property that we can identify every single element and
include any in our sample.
Sampling
Can you sample the entire population?
3 factors that influence sample representativeness:
▪ Sampling procedure
▪ Sample size
▪ Participation (response)
When might you sample the entire population?
● When your population is very small
● When you have extensive resources
● When you don't expect a very high response
Process
o Specifying a sampling method for selecting items or events from the frame
Probability Sampling
o A probability sampling scheme is one in which every unit in the population has a chance (greater than zero)
of being selected in the sample, and this probability can be accurately determined.
o When every element in the population does have the same probability of selection, this is known as an "equal
probability of selection" (EPS) design. Such designs are also referred to as "self-weighting" because all
sampled units are given the same weight.
Simple Random Sampling
o Each element in the population has a known and equal probability of selection.
o Each possible sample of a given size (n) has a known and equal probability of being the sample actually
selected.
o This implies that every element is selected independently of every other element.
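The equal-probability property described above can be illustrated with a minimal sketch. It assumes the sampling frame is simply a Python list; the frame contents, the function name simple_random_sample, and the seed are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame without replacement.

    Every subset of size n is equally likely to be returned.
    """
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    return rng.sample(frame, n)

# Hypothetical frame of 100 numbered elements.
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Because the draw is without replacement, no element can appear twice in the sample.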
Systematic Sampling
o The sample is chosen by selecting a random starting point and then picking every i-th element in succession
from the sampling frame.
o The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to
the nearest integer.
o When the ordering of the elements is related to the characteristic of interest, systematic sampling increases the
representativeness of the sample.
o If the ordering of the elements produces a cyclical pattern, systematic sampling may decrease the
representativeness of the sample.
For example, there are 100,000 elements in the population and a sample of 1,000 is desired. In this case the sampling
interval, i, is 100. A random number between 1 and 100 is selected. If, for example, this number is 23, the sample
consists of elements 23, 123, 223, 323, 423, 523, and so on.
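The worked example above can be reproduced with a short sketch. The function name systematic_positions and its start and seed parameters are hypothetical; the sketch assumes the frame elements are numbered 1 to N.

```python
import random

def systematic_positions(N, n, start=None, seed=None):
    """Return the 1-based frame positions picked by systematic sampling."""
    # Sampling interval: N / n rounded to the nearest integer
    # (Python's round() sends exact .5 ties to the even integer).
    i = round(N / n)
    if start is None:
        # Random starting point between 1 and i, inclusive.
        start = random.Random(seed).randint(1, i)
    # Every i-th position from the starting point to the end of the frame.
    return list(range(start, N + 1, i))

positions = systematic_positions(100_000, 1_000, start=23)
print(positions[:6])   # [23, 123, 223, 323, 423, 523]
print(len(positions))  # 1000
```

When N / n is not a whole number, rounding the interval means the number of positions returned can differ slightly from n.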
Stratified Sampling
o The strata should be mutually exclusive and collectively exhaustive in that every population element should
be assigned to one and only one stratum and no population elements should be omitted.
o Next, elements are selected from each stratum by a random procedure, usually SRS.
o A major objective of stratified sampling is to increase precision without increasing cost.
o The elements within a stratum should be as homogeneous as possible, but the elements in different strata
should be as heterogeneous as possible.
o The stratification variables should also be closely related to the characteristic of interest.
o Finally, the variables should decrease the cost of the stratification process by being easy to measure and apply.
o In proportionate stratified sampling, the size of the sample drawn from each stratum is proportionate to the
relative size of that stratum in the total population.
o In disproportionate stratified sampling, the size of the sample from each stratum is proportionate to the
relative size of that stratum and to the standard deviation of the distribution of the characteristic of interest
among all the elements in that stratum.
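The two allocation rules above can be sketched as follows. The stratum names, sizes, and standard deviations are hypothetical, and the functions return unrounded allocations; how to round them to whole units is left open here.

```python
def proportionate_allocation(sizes, n):
    """Allocate n in proportion to each stratum's share of the population."""
    total = sum(sizes.values())
    return {h: n * N_h / total for h, N_h in sizes.items()}

def disproportionate_allocation(sizes, sds, n):
    """Allocate n in proportion to (stratum size x stratum standard deviation)."""
    weights = {h: sizes[h] * sds[h] for h in sizes}
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}

# Hypothetical strata: population sizes and standard deviations
# of the characteristic of interest within each stratum.
sizes = {"A": 6000, "B": 3000, "C": 1000}
sds = {"A": 1.0, "B": 2.0, "C": 4.0}

print(proportionate_allocation(sizes, 100))          # A: 60, B: 30, C: 10
print(disproportionate_allocation(sizes, sds, 100))  # A: 37.5, B: 37.5, C: 25
```

In this illustration the small but highly variable stratum C receives a larger share under disproportionate allocation than under proportionate allocation.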
Post Stratification
o Stratification is sometimes introduced after the sampling phase in a process called "post-stratification".
o This approach is typically implemented due to a lack of prior knowledge of an appropriate stratifying
variable or when the experimenter lacks the necessary information to create a stratifying variable during the
sampling phase.
o Although the method is susceptible to the pitfalls of post hoc approaches, it can provide several benefits in
the right situation. Implementation usually follows a simple random sample. In addition to allowing for
stratification on an ancillary variable, post-stratification can be used to implement weighting, which can
improve the precision of a sample's estimates.
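One common way to implement the weighting mentioned above is to give each stratum the weight (population share) / (sample share). The sketch below assumes the population shares are known from an outside source; the shares, sample counts, and function names are hypothetical.

```python
def poststratification_weights(population_share, sample_counts):
    """Weight for each stratum = population share / sample share."""
    n = sum(sample_counts.values())
    return {h: population_share[h] / (sample_counts[h] / n) for h in sample_counts}

def weighted_mean(values, strata, weights):
    """Mean of values, each weighted by the weight of its stratum."""
    total_weight = sum(weights[h] for h in strata)
    return sum(weights[h] * y for y, h in zip(values, strata)) / total_weight

# Hypothetical case: a simple random sample of 100 that over-represents men.
population_share = {"male": 0.48, "female": 0.52}
sample_counts = {"male": 60, "female": 40}

weights = poststratification_weights(population_share, sample_counts)
print(weights)  # roughly male: 0.8, female: 1.3
```

Over-represented strata receive weights below 1 and under-represented strata receive weights above 1, so the weighted sample matches the population composition.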
Cluster Sampling
o Identification of clusters
o List all cities, towns, villages, and wards of cities in the target area under study, with their populations.
o Calculate the cumulative population and divide by 30; this gives the sampling interval.
o Select a random number less than or equal to the sampling interval, with the same number of digits. This identifies the first cluster.
o One-stage sampling: all of the elements within selected clusters are included in the sample.
o Two-stage sampling: a subset of elements within selected clusters is randomly selected for inclusion in the sample.
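The cluster-identification steps above can be sketched as follows. The notes stop at the first cluster; the sketch assumes the remaining clusters are found by repeatedly adding the sampling interval to the random start, which is a common convention but is not stated in the notes. The community names, populations, and the function name select_clusters are hypothetical, and the total population is assumed to be at least k.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def select_clusters(populations, k=30, start=None, seed=None):
    """Pick k clusters from a list of (name, population) pairs.

    A community is selected when a sampling point falls inside its
    range of the cumulative population.
    """
    names = [name for name, _ in populations]
    cumulative = list(accumulate(pop for _, pop in populations))
    interval = cumulative[-1] // k  # sampling interval
    if start is None:
        # Random number less than or equal to the sampling interval.
        start = random.Random(seed).randint(1, interval)
    # Assumption: later clusters are found by adding the interval repeatedly.
    points = [start + j * interval for j in range(k)]
    return [names[bisect_left(cumulative, p)] for p in points]

# Hypothetical area with three communities, selecting two clusters.
communities = [("A", 100), ("B", 300), ("C", 600)]
print(select_clusters(communities, k=2, start=50))   # ['A', 'C']
print(select_clusters(communities, k=2, start=150))  # ['B', 'C']
```

Larger communities span more of the cumulative population, so they are more likely to contain a sampling point, and a very large community can be selected more than once.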
Multistage Sampling
o All ultimate units (houses, for instance) selected at last step are surveyed.
o This technique is essentially the process of taking random samples of preceding random samples.
o Not as effective as true random sampling, but probably solves more of the problems inherent to
random sampling.
o An effective strategy because it banks on multiple randomizations. As such, it is extremely useful.
o Multistage sampling is used frequently when a complete list of all members of the population does not exist
or is inappropriate.
o Moreover, by avoiding the use of all sample units in all selected clusters, multistage sampling avoids
the large, and perhaps unnecessary, costs associated with traditional cluster sampling.
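A two-stage version of this idea, taking a random sample within a random sample of clusters, might look like the sketch below. The cluster names, unit labels, and the function name two_stage_sample are hypothetical, and simple random sampling is assumed at both stages.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))  # stage 2
        for c in chosen
    }

# Hypothetical population: 5 villages with 20 houses each.
villages = {
    f"village_{v}": [f"v{v}_house_{h}" for h in range(20)]
    for v in range(5)
}
result = two_stage_sample(villages, n_clusters=2, n_per_cluster=4, seed=0)
print(result)
```

Only the units drawn at the last stage are surveyed, so a complete list of units is needed only for the selected clusters, not for the whole population.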
Difference between Strata and Clusters
o Although strata and clusters are both non-overlapping subsets of the population, they differ in
several ways.
o All strata are represented in the sample; but only a subset of clusters are in the sample.
o With stratified sampling, the best survey results occur when elements within strata are internally
homogeneous. However, with cluster sampling, the best results occur when elements within clusters
are internally heterogeneous.
Non-probability Sampling
o Non-probability sampling is any sampling method where some elements of the population have no chance of
selection (these are sometimes referred to as 'out of coverage' or 'undercovered'), or where the probability
of selection can't be accurately determined. It involves the selection of elements based on assumptions
regarding the population of interest, which form the criteria for selection. Hence, because the selection of
elements is nonrandom, non-probability sampling does not allow the estimation of sampling errors.
Example: We visit every household in a given street, and interview the first person to answer the door. In any
household with more than one occupant, this is a non-probability sample, because some people are more likely to
answer the door (e.g. an unemployed person who spends most of their time at home is more likely to answer than an
employed housemate who might be at work when the interviewer calls) and it's not practical to calculate these
probabilities.
Convenience/Purposive Sampling
Convenience sampling attempts to obtain a sample of convenient elements. Often, respondents are selected
because they happen to be in the right place at the right time.
Judgmental Sampling
Judgmental sampling is a form of convenience sampling in which the population elements are selected based on the
judgment of the researcher.
o Test markets
o Bellwether precincts selected in voting behavior research
o Expert witnesses used in court
Quota Sampling
o The first stage consists of developing control categories, or quotas, of population elements.
o In the second stage, sample elements are selected based on convenience or judgment.
Control characteristic | Population composition | Sample composition | Sample composition
                       | (percentage)           | (percentage)       | (number)
Sex                    |                        |                    |
  Male                 | 48                     | 48                 | 480
  Female               | 52                     | 52                 | 520
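The quota counts in the table follow directly from the population percentages. The sketch below assumes a total sample size of 1,000, which is implied by 480 + 520 but not stated explicitly; the function name quota_counts is hypothetical.

```python
def quota_counts(population_pct, sample_size):
    """Number of sample elements required in each control category."""
    return {cat: round(sample_size * pct / 100) for cat, pct in population_pct.items()}

quotas = quota_counts({"Male": 48, "Female": 52}, 1000)
print(quotas)  # {'Male': 480, 'Female': 520}
```

The quotas fix only how many elements each category contributes; which elements fill each quota is still decided by convenience or judgment.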
Snowball Sampling
o After being interviewed, the initial respondents are asked to identify others who belong to the
target population of interest.
Technique | Strengths | Weaknesses
Non-probability sampling | |
  Convenience sampling | Least expensive, least time consuming, most convenient | Selection bias, sample not representative, not recommended for descriptive or causal research
  Judgmental sampling | Low cost, convenient, not time consuming | Does not allow generalization, subjective
  Quota sampling | Sample can be controlled for certain characteristics | Selection bias, no assurance of representativeness
Probability sampling | |
  Simple random sampling (SRS) | Easily understood, results projectable | Difficult to construct sampling frame, expensive
  Cluster sampling | Easy to implement, cost effective | Imprecise, difficult to compute and interpret results