BIOSTAT23G
Dairdre Mar C. Sueta LPT
Adapted from the PPT of
Dr. Alfredo Alave
Course Outline
• Introduction to Biostatistics
• Descriptive Statistics
• Some Basic Probability Concepts
• Probability Distributions
• Some Important Sampling Distributions
• Hypothesis Testing
• Analysis of Variance
• Simple Linear Regression and Correlation
Statistics and Biostatistics
Statistics
is a field of study concerned with (1) the collection, organization,
summarization, and analysis of data; and (2) the drawing of inferences about
a body of data when only a part of the data is observed.
Biostatistics
When the data analyzed are derived from the BIOLOGICAL SCIENCES
AND MEDICINE, we use the term biostatistics to distinguish this particular
application of statistical tools and concepts.
Epidemiology
The study of disease and its treatment, control, and prevention in
a population of individuals.
Whole populations may be examined, but…
More frequently, samples of the population may be examined.
Samples that are studied must be representative of the population for the
results to be generalized to the total population.
Data consist of information coming from observations, counts,
measurements, or responses.
Experiments.
Frequently the data needed to answer a question are available only as the result of
an experiment. A nurse may wish to know which of several strategies is best for maximizing
patient compliance. The nurse might conduct an experiment in which the different
strategies of motivating compliance are tried with different patients. Subsequent
evaluation of the responses to the different strategies might enable the nurse to decide
which is most effective.
Routinely kept records.
It is difficult to imagine any type of organization that does not keep
records of day-to-day transactions of its activities. Hospital medical records,
for example, contain immense amounts of information on patients, while
hospital accounting records contain a wealth of data on the facility’s business
activities. When the need for data arises, we should look for them first among
routinely kept records.
Surveys.
If the data needed to answer a question are not available from
routinely kept records, the logical source may be a survey. Suppose, for
example, that the administrator of a clinic wishes to obtain information
regarding the mode of transportation used by patients to visit the clinic. If
admission forms do not contain a question on mode of transportation, we
may conduct a survey among patients to obtain this information.
External sources.
The data needed to answer a question may already exist in the
form of published reports, commercially available data banks, or the research literature. In
other words, we may find that someone else has already asked the same question, and the
answer obtained may be applicable to our present situation.
A population is the collection of all outcomes, responses,
measurements, or counts that are of interest.
A sample is a subset of a population.
Populations & Samples
Example:
In a recent survey, 3000 students at Union College were asked if they smoked
cigarettes regularly. 350 of the students said yes. Identify the population and
the sample.
Population: Responses of all college students at UNO-R
Sample: Responses of students in survey
Parameters & Statistics
A parameter is a numerical description of a population
characteristic.
A statistic is a numerical description of a sample
characteristic.
Parameter: describes a Population
Statistic: describes a Sample
Instruction: Decide whether the numerical value describes a population parameter or a
sample statistic.
A recent survey of a randomly selected sample of 450 college students reported that the
average daily allowance for students is Php 325.
Because the average of Php 325 is based on a sample, this is a sample statistic.
In a random check of a sample of retail stores, the Food and Drug Administration found
that 34% of the stores were not storing fish at the proper temperature.
Because the figure of 34% is based on a subset of the population, it is a sample
statistic.
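The distinction between a parameter and a statistic can be sketched in Python. This is only an illustration: the population of allowances below is simulated (it is not real survey data), and the names `population`, `sample`, and the seed are assumptions made for the example.

```python
import random
import statistics

# Hypothetical population: daily allowances (in Php) of 5,000 students.
# The values are simulated for illustration only.
rng = random.Random(42)
population = [rng.randint(100, 600) for _ in range(5000)]

# Parameter: a numerical description computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same description computed from a random sample of 450.
sample = rng.sample(population, 450)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, which is why a value computed from a sample is reported as a statistic rather than as the parameter itself.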
Branches of Statistics
The study of statistics has two major branches: descriptive statistics and inferential
statistics.
Statistics
Descriptive statistics: Involves the organization, summarization, and display of data.
Inferential statistics: Involves using a sample to draw conclusions about a population.
Descriptive and Inferential Statistics
Example:
A large sample of men, aged 48, was studied for 18 years. For
unmarried men, approximately 70% were alive at age 65. For
married men, 90% were alive at age 65.
Descriptive statistics involves statements such as “For unmarried men,
approximately 70% were alive at age 65” and “For married men, 90% were alive at
65.”
A possible inference drawn from the study is that being married is associated with
a longer life for men.
Variable
If, as we observe a characteristic, we find that it takes on different
values in different persons, places, or things, we label the characteristic a
variable.
QUANTITATIVE VARIABLE
We can obtain measurements on the heights of adult males, the weights of
preschool children, and the ages of patients seen in a dental clinic. These are examples
of quantitative variables.
Variable
QUALITATIVE VARIABLES
Some characteristics can only be categorized, for example when an ill person is
given a medical diagnosis, a person is designated as belonging to an ethnic group, or
a person, place, or object is said to possess or not to possess some characteristic of
interest.
We refer to variables of this kind as qualitative variables. Measurements made on
qualitative variables convey information regarding an attribute.
Classify the following as Quantitative
or Qualitative data:
1. Color of the eye
2. Number of computers in the room
3. Civil status
4. Address
5. Cellular phone number
Random Variable
When the values obtained arise as a result of chance factors, so that
they cannot be exactly predicted in advance, the variable is called a random
variable. An example of a random variable is adult height. Attained adult
height is the result of numerous genetic and environmental factors.
Discrete Random Variable
A discrete random variable is characterized by gaps or interruptions in the
values that it can assume. These gaps or interruptions indicate the absence of
values between particular values that the variable can assume.
Continuous Random Variable
A continuous random variable does not possess the gaps or
interruptions characteristic of a discrete random variable. A continuous
random variable can assume any value within a specified relevant interval of
values assumed by the variable.
Classify the following as Discrete or
Continuous:
1. Weight of a body
2. Length of a rod
3. Number of chairs in the room
4. Dimension of a table
5. Number of possible outcomes in
throwing a die
Levels of Measurement
The level of measurement determines which statistical calculations are
meaningful. The four levels of measurement are: nominal, ordinal, interval, and
ratio.
Levels of Measurement (lowest to highest):
1. Nominal
2. Ordinal
3. Interval
4. Ratio
Levels of Measurement
Data at the nominal level of measurement are qualitative only.
Nominal: Categorized using names, labels, or qualities. No mathematical
computations can be made at this level.
Levels of Measurement
Data at the ordinal level of measurement can be arranged in order. However,
differences between data values either cannot be determined or are meaningless.
Ordinal: Arranged in order, but differences between data entries are not
meaningful.
Levels of Measurement
Data at the interval level of measurement are quantitative. A zero entry
simply represents a position on a scale; the entry is not an inherent
zero.
Interval: Arranged in order, and the differences between data entries can be
calculated.
Levels of Measurement
Data at the ratio level of measurement are similar to the
interval level, but a zero entry is meaningful.
Ratio: A ratio of two data values can be formed, so one data value can be
expressed as a multiple of another.
Summary of Levels of Measurement
▪ Nominal - categories only
▪ Ordinal - categories with some order
▪ Interval - differences but no natural starting point
▪ Ratio - differences and a natural starting point
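The practical difference between the interval and ratio levels can be sketched with a short calculation. Temperature is an assumed example here (it does not appear in the slides): Celsius has no natural zero, while Kelvin does.

```python
# Interval level: Celsius has no natural starting point (0 °C is just a position
# on the scale), so differences are meaningful but ratios are not.
temp_a_c, temp_b_c = 10.0, 20.0
difference_c = temp_b_c - temp_a_c   # meaningful: the readings are 10 degrees apart
naive_ratio = temp_b_c / temp_a_c    # 2.0, but 20 °C is NOT "twice as hot" as 10 °C

# Ratio level: Kelvin has a natural zero, so a ratio of two values is meaningful.
temp_a_k = temp_a_c + 273.15
temp_b_k = temp_b_c + 273.15
true_ratio = temp_b_k / temp_a_k     # roughly 1.035

print(f"Difference: {difference_c} degrees")
print(f"Naive Celsius ratio: {naive_ratio}")
print(f"Kelvin ratio: {true_ratio:.3f}")
```

The same pair of temperatures gives a ratio of 2 on one scale and about 1.035 on the other, which shows why ratios should only be formed for data with a natural zero.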
Summary of Levels of Measurement
Identify the level of measurement for each of the following:
1. The senator’s name is Sam Wilson
2. The senator is 58 years old
3. The years in which the senator was elected to the
Senate are 1992, 1998, and 2004.
4. The senator’s total taxable income last year was
$878,314.
Designing a Statistical Study
GUIDELINES
1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the
sample is representative of the population.
3. Collect the data.
4. Describe the data.
5. Interpret the data and make decisions about the population using inferential
statistics.
6. Identify any possible errors.
PROBABILITY SAMPLING
Definitions
❖ Random Sample
members of the population are selected in
such a way that each individual member has
an equal chance of being selected
❖ Simple Random Sample (of size n)
subjects selected in such a way that every
possible sample of the same size n has the
same chance of being chosen
Copyright © 2004 Pearson
Education, Inc.
Simple Random Sampling
selection so that each member of the population has an
equal chance of being selected
• Lottery
• Fish bowl technique
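The lottery or fish bowl technique can be sketched in Python as drawing IDs without replacement. The frame of 50 ID numbers, the sample size, and the seed are assumptions made for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers of 50 population members.
population_ids = list(range(1, 51))

# A fixed seed is used only so the draw can be repeated; omit it in practice.
rng = random.Random(2024)

# random.sample draws without replacement, so every possible sample
# of size n has the same chance of being chosen.
n = 5
sample_ids = rng.sample(population_ids, n)
print(sample_ids)
```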
Systematic Sampling
Select some starting point and then
select every kth element in the population
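A minimal sketch of systematic sampling, assuming a frame of N = 100 members and a desired sample of n = 10, so that the interval is k = N / n = 10. The frame, the seed, and the variable names are illustrative assumptions.

```python
import random

# Hypothetical sampling frame of N = 100 members, listed in order.
population_ids = list(range(1, 101))
n = 10
k = len(population_ids) // n   # sampling interval: every kth element

# Choose a random starting point within the first k elements.
rng = random.Random(7)
start = rng.randrange(k)       # an index from 0 to k - 1

# Take the starting element and every kth element after it.
sample_ids = population_ids[start::k]
print(sample_ids)
```

Choosing the starting point at random is a common way to keep the procedure a probability sample; the slide itself only says to select some starting point.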
Stratified Sampling
subdivide the population into at
least two different subgroups that share the same
characteristics, then draw a sample from each
subgroup (or stratum)
Strata            Number of Families   Proportion          Number of Families in Sample
High Income       1000                 1000/5000 = 0.2     0.2(200) = 40
Average Income    2500                 2500/5000 = 0.5     0.5(200) = 100
Low Income        1500                 1500/5000 = 0.3     0.3(200) = 60
Total             5000                                     200
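The proportional allocation in the table can be reproduced with a short calculation. The stratum sizes and the total sample of 200 come from the table; the variable names are assumptions for illustration.

```python
# Stratum sizes and the total sample size, taken from the table.
strata_sizes = {"High Income": 1000, "Average Income": 2500, "Low Income": 1500}
total_sample = 200

population_total = sum(strata_sizes.values())

# Each stratum receives a share of the sample equal to its share of the population.
allocation = {
    name: round(total_sample * size / population_total)
    for name, size in strata_sizes.items()
}

for name, count in allocation.items():
    print(f"{name}: {count} families")
```

In this example the shares come out as whole numbers. With other stratum sizes, rounding may make the allocations sum to slightly more or less than the intended total, so the result should be checked and adjusted by hand.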
Cluster Sampling
divide the population into sections
(or clusters); randomly select some of those clusters;
choose all members from selected clusters
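The three steps of cluster sampling can be sketched as follows. The clusters and their members are hypothetical, as are the seed and the choice of two clusters.

```python
import random

# Step 1: the population is divided into clusters (hypothetical data).
clusters = {
    "Cluster A": ["A1", "A2", "A3"],
    "Cluster B": ["B1", "B2"],
    "Cluster C": ["C1", "C2", "C3", "C4"],
    "Cluster D": ["D1", "D2"],
}

# Step 2: randomly select some of the clusters (here, 2 of the 4).
rng = random.Random(11)
chosen = rng.sample(sorted(clusters), 2)

# Step 3: include ALL members of each selected cluster.
sample = [member for name in chosen for member in clusters[name]]
print(chosen, sample)
```

Unlike stratified sampling, where a sample is drawn from every subgroup, only the selected clusters contribute members, and they contribute all of them.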
MULTI-STAGE SAMPLING
• Also called multi-stage cluster sampling
• Used to overcome problems associated with a geographically
dispersed population when face-to-face contact is needed or
where it is expensive and time consuming to construct a
sampling frame for a large geographical area
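A two-stage version of this idea might look like the sketch below: clusters are sampled first, then individuals are sampled within each chosen cluster. The regions, ID ranges, stage sizes, and seed are all assumptions for illustration.

```python
import random

# Hypothetical first-stage units: regions, each with a list of resident IDs.
regions = {
    "North": list(range(100, 120)),
    "South": list(range(200, 230)),
    "East": list(range(300, 315)),
    "West": list(range(400, 425)),
}

rng = random.Random(3)

# Stage 1: randomly select 2 regions.
stage_one = rng.sample(sorted(regions), 2)

# Stage 2: within each selected region, randomly select 5 residents.
stage_two = {region: rng.sample(regions[region], 5) for region in stage_one}
print(stage_two)
```

A full list of residents is needed only for the regions selected in the first stage, which is the saving in sampling-frame construction described above.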
NON-PROBABILITY SAMPLING
• Convenience / Accidental / Incidental Sampling
• Snowball Sampling
• Purposive Sampling
• Quota Sampling