This chapter introduces the field and provides
a survey of data and data types.
Chapter 1
Objectives
Demonstrate knowledge of all
statistical terms.
Differentiate between the two
branches of statistics.
Identify types of data.
Identify the measurement level
for each variable.
Identify the four basic sampling
techniques.
• Statistics is the science whereby inferences are made about specific random
phenomena on the basis of relatively limited sample material.
• The field of statistics has two main areas: mathematical statistics and applied
statistics.
1- Mathematical statistics concerns the development of new methods of
statistical inference and requires detailed knowledge of abstract mathematics
for its implementation.
2- Applied statistics involves applying the methods of mathematical statistics to
specific subject areas, such as economics, psychology, and public health.
Biostatistics is the branch of applied statistics that applies statistical methods to
medical and biological problems.
What is Statistics?
“Statistics is a way to get information from data”
Data -> Statistics -> Information
Data: Facts, especially numerical facts, collected together for reference or
information.
Information: Knowledge communicated concerning some particular fact.
Statistics is a tool for creating new understanding from a set of
numbers.
Key Statistical Concepts…
Population
— a population is the group of all items of interest to a
statistics practitioner.
— frequently very large; sometimes infinite.
E.g. All 5 million Florida voters, per Example 12.5
Sample
— A sample is a set of data drawn from the population.
— Potentially very large, but smaller than the population.
E.g. a sample of 765 voters exit polled on election day.
Parameter
— A descriptive measure of a population.
Statistic
— A descriptive measure of a sample.
A sample is a subset of the population. A parameter describes the
population; a statistic describes the sample.
Populations have Parameters,
Samples have Statistics.
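A minimal Python sketch of the distinction, under stated assumptions: the population below is synthetic (made-up voter ages, not data from the example above), and the names `population`, `sample`, `parameter_mean`, and `statistic_mean` are illustrative only. The sample size of 765 simply mirrors the exit-poll example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 voters (synthetic data).
population = [random.randint(18, 90) for _ in range(10_000)]

# Parameter: a descriptive measure of the whole population.
parameter_mean = statistics.mean(population)

# Sample: a subset drawn at random from the population.
sample = random.sample(population, 765)

# Statistic: the same descriptive measure, computed on the sample only.
statistic_mean = statistics.mean(sample)

print(f"population mean (parameter): {parameter_mean:.2f}")
print(f"sample mean (statistic):     {statistic_mean:.2f}")
```

The two printed values will usually be close but not identical, which is why a statistic is treated as an estimate of a parameter rather than as the parameter itself.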
Types of Statistics
1- Descriptive statistics consists of
the collection, organization,
summarization, and presentation of data.
2- Inferential statistics consists of
generalizing from samples to
populations, performing hypothesis
testing, determining relationships
among variables, and making
predictions.
Descriptive Statistics…
…are methods of organizing, summarizing, and presenting
data in a convenient and informative way. These methods
include:
Graphical Techniques
Numerical Techniques
The actual method used depends on what information we would
like to extract. Are we interested in…
• measure(s) of central location? and/or
• measure(s) of variability (dispersion)?
Descriptive Statistics helps to answer these questions…
Statistical Inference…
Statistical inference is the process of making an estimate,
prediction, or decision about a population based on a sample.
[Diagram: a Sample is drawn from the Population; Inference runs from the
Sample's Statistic back to the Population's Parameter.]
What can we infer about a Population’s Parameters
based on a Sample’s Statistics?
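One possible way to make this concrete is a point estimate with an approximate 95% interval. The slides do not prescribe a method; the sketch below assumes a normal approximation for the sample mean, and the population is synthetic. All names (`population`, `sample`, `estimate`, `interval`) are illustrative.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of 50,000 measurements (synthetic data).
population = [random.gauss(100, 15) for _ in range(50_000)]
sample = random.sample(population, 400)

# Point estimate of the population mean, using only the sample.
estimate = statistics.mean(sample)

# Standard error of the mean, from the sample standard deviation.
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% interval under a normal approximation (z is about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
interval = (estimate - z * std_error, estimate + z * std_error)

print(f"estimate: {estimate:.2f}")
print(f"approximate 95% interval: ({interval[0]:.2f}, {interval[1]:.2f})")
```

The interval expresses the uncertainty that comes from seeing only a sample; a larger sample would give a smaller standard error and a narrower interval.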
Definitions…
A variable is some characteristic of a population or sample.
E.g. student grades.
Typically denoted with a capital letter: X, Y, Z…
The values of the variable are the range of possible values
for a variable.
E.g. student marks (0..100)
Data are the observed values of a variable.
E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
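As an illustration, the student marks above can be summarized with measures of central location and variability. This is a sketch using Python's standard `statistics` module; the variable names are illustrative, and the choice of measures is an assumption rather than something the slides specify.

```python
import statistics

# Observed values of the variable "student marks" (from the example above).
marks = [67, 74, 71, 83, 93, 55, 48]

# Measures of central location.
mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)

# Measures of variability (dispersion).
mark_range = max(marks) - min(marks)
std_dev = statistics.stdev(marks)  # sample standard deviation

print(f"mean = {mean_mark:.2f}, median = {median_mark}")
print(f"range = {mark_range}, standard deviation = {std_dev:.2f}")
```

For these seven marks the median is 71 and the range is 45; the mean is about 70.14.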
Types of Data…
Quantitative data
Numbers representing counts or
measurements
Qualitative (or categorical or
attribute) data
Can be separated into different categories
that are distinguished by some nonnumeric
characteristics
Examples
Quantitative data
The number of students with blue eyes
Qualitative (or categorical or
attribute) data
The eye color of students
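The two examples are related: counting how often a category occurs turns qualitative observations into a quantitative value. A small sketch, assuming a made-up list of eye colors (the data and names are hypothetical):

```python
from collections import Counter

# Hypothetical qualitative data: one eye color recorded per student.
eye_colors = ["brown", "blue", "green", "brown", "blue", "brown", "hazel"]

# Counting each category yields quantitative data (counts).
counts = Counter(eye_colors)
blue_eyed = counts["blue"]

print(counts)
print(f"students with blue eyes: {blue_eyed}")
```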
Types of Data (quantitative)
• Discrete variables assume values
that can be counted.
• Continuous variables can assume
all values between any two specific
values. They are obtained by
measuring.
Definitions
We further describe quantitative data by
distinguishing between discrete and
continuous data.
Discrete data result when the number of possible values is
either a finite number or a ‘countable’ number of
possible values (0, 1, 2, 3, . . .).
Continuous (numerical) data result from infinitely many possible
values that correspond to some continuous scale or
interval that covers a range of values without gaps,
interruptions, or jumps (for example, every value between 2 and 3).
Examples
Discrete
The number of eggs that hens lay; for
example, 3 eggs a day.
Continuous
The amounts of milk that cows produce; for
example, 2.343115 gallons a day.
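A rough sketch of the same contrast in Python, with made-up values: counts are naturally stored as integers, while measurements are stored as floating-point numbers whose recorded precision depends on the measuring instrument. The variable names are illustrative.

```python
# Discrete: eggs are counted, so the values are whole numbers.
eggs_per_day = [3, 2, 4, 3, 0]        # hypothetical counts

# Continuous: milk is measured, so any value in a range is possible.
milk_gallons = [2.343115, 2.5, 1.98]  # hypothetical measurements

# Every egg count is a whole number.
all_counts = all(isinstance(n, int) for n in eggs_per_day)

# Between two different measurements there is always another possible value.
midpoint = (milk_gallons[0] + milk_gallons[1]) / 2

print(all_counts)
print(midpoint)
```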
Interval Scale
Researcher can specify the rank ordering of
values and the distance between them
Intervals are equal, but there is no rational (true) zero
point (examples: IQ scale, Fahrenheit
scale)
Data can be treated mathematically;
most statistical tests are possible
Ratio Scale
Highest level of measurement
Rational, meaningful zero point
Absolute magnitude of variable (e.g.,
mg/ml of glucose in urine)
Ideal for all statistical tests
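The practical difference between the two scales can be sketched numerically. On an interval scale, differences are meaningful but ratios are not, because the zero point is arbitrary; on a ratio scale, ratios survive a change of units. The temperatures and concentrations below are made-up illustrative values.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (deg_f - 32) * 5 / 9

# Interval scale: equal differences stay equal after a change of units.
diff_c_high = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(60)
diff_c_low = fahrenheit_to_celsius(40) - fahrenheit_to_celsius(20)

# ...but ratios do not survive: 80 F is "twice" 40 F only in Fahrenheit.
ratio_f = 80 / 40                                                # 2.0
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)  # about 6

# Ratio scale: a concentration has a true zero, so ratios are preserved.
conc_mg_per_ml = (10.0, 5.0)                         # hypothetical values
conc_mg_per_dl = tuple(c * 100 for c in conc_mg_per_ml)
ratio_ml = conc_mg_per_ml[0] / conc_mg_per_ml[1]     # 2.0
ratio_dl = conc_mg_per_dl[0] / conc_mg_per_dl[1]     # 2.0

print(diff_c_high, diff_c_low)
print(ratio_f, ratio_c)
print(ratio_ml, ratio_dl)
```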
Types of Data (Qualitative)
• The nominal level of measurement classifies data
into mutually exclusive (nonoverlapping),
exhaustive categories in which no order or
ranking can be imposed on the data.
Nominal Data…
• The values of nominal data are categories.
E.g. responses to questions about marital status, coded as:
Single = 1, Married = 2, Divorced = 3, Widowed = 4
Because the numbers are arbitrary, arithmetic operations
don’t make any sense (e.g. does Widowed ÷ 2 = Married?!)
Nominal data are also called qualitative or categorical.
Nominal Measurement
Lowest Level
Sorting into categories
Numbers are merely symbols and have no
quantitative significance
Assign equivalence or nonequivalence
Examples: gender, marital status, etc.
(male/female, smoker/nonsmoker, alive/dead)
Rules of Nominal system
All members of one category are
assigned the same number
No two categories are assigned the
same number (mutual exclusivity)
Cannot treat the numbers
mathematically
Mode is the only measure of central
tendency
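A short sketch of why the mode is usable for nominal data while the mean is not: the mode does not depend on the numeric codes, whereas the "mean" changes when the categories are recoded with a different, equally arbitrary set of numbers. The responses and the second coding are hypothetical; the first coding is the marital-status coding shown earlier.

```python
import statistics

# Coding from the marital-status example.
codes = {"Single": 1, "Married": 2, "Divorced": 3, "Widowed": 4}

# A different, equally arbitrary coding (hypothetical).
recoded = {"Single": 4, "Married": 3, "Divorced": 1, "Widowed": 2}

# Hypothetical survey responses.
responses = ["Married", "Single", "Married", "Widowed", "Divorced", "Married"]

# The mode is the most frequent category; no numeric codes are needed.
mode_label = statistics.mode(responses)

# The "mean" depends entirely on which arbitrary coding was chosen.
mean_a = statistics.mean(codes[r] for r in responses)
mean_b = statistics.mean(recoded[r] for r in responses)

print(mode_label)
print(mean_a, mean_b)
```

Because the two means disagree while describing the same responses, neither carries any information about the data.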
• The ordinal level of measurement
classifies data into categories that
can be ranked; precise differences
between the ranks do not exist.
Ordinal Data…
Ordinal Data appear to be categorical in nature, but their
values have an order, that is, a ranking:
E.g. College course rating system:
poor = 1, fair = 2, good = 3, very good = 4, excellent = 5
While it's still not meaningful to do arithmetic on this data
(e.g. does 2*fair = very good?!), we can say things like:
excellent > poor or fair < very good
That is, order is maintained no matter what numeric values
are assigned to each category.
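The point about order can be sketched as follows: any coding that preserves the ranking gives the same comparisons and the same sorted order, even when the gaps between codes differ. The second coding and the responses are hypothetical; the first coding is the course-rating coding above.

```python
import statistics

# Categories in ascending order, from the course-rating example.
ratings = ["poor", "fair", "good", "very good", "excellent"]
rank = {label: i + 1 for i, label in enumerate(ratings)}  # poor = 1 ... excellent = 5

# Another order-preserving coding with unequal gaps (hypothetical).
alt_rank = {"poor": 10, "fair": 20, "good": 35, "very good": 70, "excellent": 100}

# Hypothetical course evaluations.
responses = ["good", "excellent", "fair", "good", "very good"]

# Sorting by either coding gives the same order of responses.
sorted_a = sorted(responses, key=rank.get)
sorted_b = sorted(responses, key=alt_rank.get)

# Frequencies and the mode do not depend on the codes at all.
mode_label = statistics.mode(responses)

print(sorted_a == sorted_b)
print(rank["excellent"] > rank["poor"], alt_rank["fair"] < alt_rank["very good"])
print(mode_label)
```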
The Ordinal Scale
Sorting variations on the basis of their
relative standing to each other
Attributes ordered according to some
criterion (e.g. best to worst)
Intervals are not necessarily equal
Should not be treated mathematically;
frequencies and modes are acceptable
[Figure: Ordinal scale. Bar chart of three levels; vertical axis from 0 to 4.5.]