Chapter 1 - Introduction To Statistics
Data - observations (such as measurements, genders, survey responses) that have been
collected.
Statistics - a collection of methods for planning experiments, obtaining data, and then
organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions
based on the data.
Population - the complete collection of all elements (scores, people, measurements, and
so on) to be studied. The collection is complete in the sense that it includes all subjects to
be studied. (TOTALITY)
Sample - a sub-collection of elements drawn from a population. From a sample, the
characteristics of the whole population can be estimated. (PORTION OF THE POPULATION)
Quantitative data - numbers representing counts or measurements. Two types:
- Discrete - data that result when the number of possible values is either finite
or “countable” (0, 1, 2, 3, …). (ex: The number of eggs that hens lay.)
- Continuous - numerical data from infinitely many possible values that correspond
to some continuous scale that covers a range of values without gaps, interruptions,
or jumps. (ex: The amount of milk that a cow produces; e.g. 2.343115 gallons per
day.)
Qualitative (or categorical or attribute) data - can be separated into different categories
that are distinguished by some nonnumeric characteristic (ex: the genders (male/female) of
professional athletes)
4 Levels of Measurement: nominal, ordinal, interval, and ratio (another way to classify
data is to use levels of measurement)
*The difference between interval and ratio scales comes from whether zero is a true zero.
Interval scales have no true zero, so values can fall below zero. For example, you
can measure temperature below 0 degrees Celsius, such as -10 degrees (thermometer).
Ratio scales, on the other hand, start at a true zero that means none of the quantity is
present. Height and weight measure from 0 and above, but never fall below it. (ruler)
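One practical consequence of the true zero is that ratios ("twice as much") only make sense at the ratio level. The short sketch below illustrates this with made-up temperature and height values; the function name and the numbers are assumptions chosen for illustration only.

```python
def celsius_to_kelvin(c):
    # Kelvin has a true zero; Celsius (interval level) does not.
    return c + 273.15

# Interval level: 20 C looks like "twice" 10 C, but the ratio changes
# as soon as the same temperatures are expressed on another scale.
celsius_ratio = 20 / 10
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

# Ratio level: a height of 180 cm is twice 90 cm in any unit,
# because every unit of length shares the same true zero.
height_ratio_cm = 180 / 90
height_ratio_in = (180 / 2.54) / (90 / 2.54)

print(celsius_ratio, round(kelvin_ratio, 4))
print(height_ratio_cm, round(height_ratio_in, 4))
```

The Celsius ratio is not preserved after converting to Kelvin, while the height ratio stays the same in centimeters and inches.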
*If sample data are not collected in an appropriate way, the data may be so completely
useless that no amount of statistical torturing can salvage them.
Randomness typically plays a critical role in determining which data to collect.
Observational Study
- Observing and measuring specific characteristics without attempting to modify
the subjects being studied.
- (ex: the effect of nature on the well-being of respondents surveyed at a park)
Experimental Study
- Apply some treatment and then observe its effects on the subjects.
(subject/respondent is being manipulated and then observed)
Confounding
- Occurs in an experiment when the experimenter is not able to distinguish between
the effects of different factors
- (ex: smoking and age (independent variables; causes) may both affect depression
(dependent variable; effect), so their separate effects cannot be told apart.)
Try to plan the experiment so that confounding does not occur!
● Sample size: Use a sample size that is large enough to see the true nature of any
effects and obtain that sample using an appropriate method, such as one based
on randomness
● Random sample: Members of the population are selected in such a way that each
individual member has an equal chance of being selected
● Simple random sample (of size n): Subjects selected in such a way that every
possible sample of the same size n has the same chance of being chosen
● Systematic sampling: Select some starting point and then select every kth
element in the population
● Cluster sampling: Divide the population into sections (or clusters); randomly select
some of those clusters; choose all members from the selected clusters
● Sampling error: The difference between a sample result and the true population
result; such an error results from chance sample fluctuations
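The sampling methods above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the population (the integers 1 to 100), the sample sizes, the fixed seed, and all function names are assumptions made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    # rng.sample draws n distinct elements; every possible
    # sample of size n has the same chance of being chosen.
    return rng.sample(population, n)

def systematic_sample(population, k, start):
    # Take the element at index `start`, then every k-th element after it.
    return population[start::k]

def cluster_sample(clusters, num_clusters, rng):
    # Randomly choose whole clusters, then keep ALL members of each one.
    chosen = rng.sample(clusters, num_clusters)
    return [member for cluster in chosen for member in cluster]

rng = random.Random(0)            # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical population: 1..100
population_mean = sum(population) / len(population)

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, k=10, start=rng.randrange(10))
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
clustered = cluster_sample(clusters, 2, rng)

# Sampling error: sample result minus the true population result.
sampling_error = sum(srs) / len(srs) - population_mean

print("simple random:", sorted(srs))
print("systematic:   ", systematic)
print("cluster:      ", clustered)
print("sampling error of the simple random sample mean:", sampling_error)
```

Rerunning with a different seed gives different samples and a different sampling error, which is the chance fluctuation the definition refers to.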
Inferential statistics: Use sample data to make inferences (or generalizations) about a
population
Lower Class Limits: The smallest numbers that can actually belong to different classes
Upper Class Limits: The largest numbers that can actually belong to different classes
Class Boundaries: The numbers used to separate classes, but without the gaps created by
class limits
Class Midpoints: Can be found by adding the lower class limit to the upper class limit and
dividing the sum by two
Class Width: The difference between two consecutive lower class limits or two consecutive
lower class boundaries
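As a worked illustration of the class definitions above, the sketch below computes boundaries, midpoints, and width for a hypothetical frequency table with classes 60-69, 70-79, 80-89, 90-99, and 100-109. The class limits are invented for the example, and the boundary calculation assumes the usual convention of splitting the gap between adjacent classes in half.

```python
# Hypothetical class limits for a frequency table.
lower_limits = [60, 70, 80, 90, 100]
upper_limits = [69, 79, 89, 99, 109]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Gap left between one class's upper limit and the next class's lower limit.
gap = lower_limits[1] - upper_limits[0]

# Class boundaries: split each gap in half so adjacent classes meet.
lower_boundaries = [lo - gap / 2 for lo in lower_limits]
upper_boundaries = [hi + gap / 2 for hi in upper_limits]

# Class midpoints: (lower class limit + upper class limit) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

print("width:", class_width)
print("boundaries:", list(zip(lower_boundaries, upper_boundaries)))
print("midpoints:", midpoints)
```

Here the width is 10, the first class runs from boundary 59.5 to 69.5, and its midpoint is 64.5.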