STATISTICAL ANALYSIS
USING COMPUTER
(A REVIEW)
by Arlene Nisperos-Mendoza
BASIC CONCEPTS
Statistics: a set of methods, procedures and rules
for organizing, summarizing, and interpreting
information.
This is a general definition.
Later, a distinction between statistics and
parameters will be made.
Here, it would be better to speak of statistical
methods.
STATISTICAL ANALYSIS USING COMPUTER by Arlene Nisperos-Mendoza
Use of Symbols in Statistics
Statisticians (and statistics books) use symbols as
shorthand for complex concepts and constructs.
Symbols are typically either Roman (Latin) or Greek letters.
For example:
µ (the lower-case Greek letter mu) typically represents the
mean (arithmetic average) of a set of values, usually a population.
σ (the lower-case Greek letter sigma) typically
represents the standard deviation of a set of values, usually a population.
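As a minimal sketch of what these two symbols summarize, the mean and standard deviation can be computed with Python's standard `statistics` module. The scores are hypothetical values chosen only for illustration; `pstdev` treats the values as a whole population, while `stdev` treats them as a sample.

```python
import statistics

# Hypothetical scores, used only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the eight values as a complete population:
mu = statistics.mean(scores)        # population mean
sigma = statistics.pstdev(scores)   # population standard deviation (divides by N)

# Treating the same values as a sample from a larger population:
s = statistics.stdev(scores)        # sample standard deviation (divides by n - 1)

print(mu, sigma, round(s, 3))
```

The sample standard deviation comes out slightly larger than the population one because it divides by n - 1 instead of N.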
Three Types of Statistical Methods:
Descriptive Statistics: methods used to summarize,
organize, and simplify data.
Exploratory Statistics: methods for carefully examining
data prior to using more complicated statistical
procedures.
Inferential Statistics: methods that allow us to make
generalizations about populations based on data
obtained from samples.
Population vs Sample
Population: all members of a particular group (e.g., all
freshmen, all males over the age of 21, all of the HEIs
in Pangasinan).
Sample: a subgroup of a population that is usually
assumed to be representative of the population (e.g.,
10 freshmen selected at random).
Parameters and Statistics
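Parameter: a numerical value that describes a
characteristic of a population (e.g., the population mean).
Statistic: a numerical value that describes a
characteristic of a sample (e.g., the sample mean).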
Statistics are often used to estimate or draw
inferences about parameters.
Sampling error: the difference between a
sample statistic and its corresponding
population parameter.
The values of sample statistics vary from
sample to sample, even when all samples
are drawn from the same population.
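A short simulation can make this concrete. The sketch below builds a hypothetical population (the size, mean of 100, and standard deviation of 15 are assumptions chosen for illustration), draws several random samples from it, and prints how far each sample mean lands from the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values (an assumption for illustration).
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # population parameter

def sampling_error(sample_size):
    """Draw one random sample and return (sample mean - population mean)."""
    sample = random.sample(population, sample_size)
    return statistics.mean(sample) - mu

# Five samples from the same population give five different errors.
errors = [sampling_error(25) for _ in range(5)]
for e in errors:
    print(f"sampling error: {e:+.2f}")
```

Increasing `sample_size` tends to shrink the errors, which is one reason larger samples are preferred when estimating parameters.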
Internal vs External Validity
Internal validity: concerned with whether
the methods and procedures used in a study
warrant the conclusions drawn from the
study.
External validity: concerned with the extent
to which the results obtained on a sample
generalize to the target population.
Statistical procedures are the tools of
research.
There are several types (or methods) of
research studies and the type of statistical
procedure used will often vary from one type
of research to another.
The experimental method is used when the
researcher wants to establish a cause-and-effect
relationship.
The researcher manipulates one variable (the
independent variable),
observes (or measures) what happens to a
second variable (the dependent variable),
while controlling for all other variables
(extraneous variables).
Types of measurement: nominal.
• Coarse level of measurement used for
identification purposes.
• Substitutes numbers for other categorical labels.
• No order of magnitude is implied.
• Examples: sex (male or female), student
classification (freshman, sophomore, junior,
senior), etc.
• Sometimes referred to as qualitative measurement.
Also called categorical or qualitative data (or variables).
Represent lowest level of measurement.
Classify individuals into one of two or more mutually
exclusive categories.
The categories are usually represented by numbers, e.g.:
1 if Male, 2 if Female.
1 if Democrat, 2 if Republican, 3 if Independent, 4 if
Other.
The numbers DO NOT indicate more or less of an attribute.
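Because the codes are only labels, counting how many individuals fall in each category is meaningful, while arithmetic on the codes is not. A minimal sketch, using a hypothetical set of coded responses:

```python
from collections import Counter

# Hypothetical coding scheme and responses, for illustration only.
SEX_LABELS = {1: "Male", 2: "Female"}
responses = [1, 2, 2, 1, 2, 2, 1, 2]

# Counting category membership is meaningful for nominal data.
counts = Counter(SEX_LABELS[code] for code in responses)

# The mode (most frequent category) is the appropriate "typical value".
mode_label, mode_count = counts.most_common(1)[0]

print(counts)
print(mode_label, mode_count)
```

Averaging the codes themselves would produce a number with no interpretation, since 2 does not mean "more" than 1.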
Types of measurement: ordinal.
In addition to classifying individuals, ordinal scales rank
individuals in terms of the degree to which they possess
the measured characteristics or attributes.
Ordinal scales allow us to compare individuals in terms of who
has more (or less) of a characteristic or attribute.
Ordinal scales do NOT indicate HOW MUCH more or less.
• E.g.:
- Class rank
- Acrobatic competition judgments.
• What about grades or GPA?
Types of measurement: scale.
Includes both interval and ratio level scales.
• Scale measurements yield equal intervals between
adjacent scale points.
• The difference between 5’-6” and 6’ is the same as the
difference between 3’ and 3’-6”.
• The difference between an SAT-V score of 435 and 445 is
the same as the difference between a score of 520 and
530.
• Most scores obtained from achievement tests, aptitude
tests, etc. are treated as scaled data.
Quantitative data (sometimes called measurement
data) results from some form of measurement.
- Typically involves the use of some type of
measuring instrument.
Qualitative data (also called categorical data or
frequency data)
- results from counting the occurrences of
variables.
Discrete and Continuous variables
• Variables can also be described in terms of the
types of values they can be assigned.
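- Discrete variables take only separate, countable values
(e.g., categories or whole-number counts). No values
between two adjacent values are permissible.
- Continuous variables can (theoretically) take any value
within a range, so an infinite number of values is possible.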
Types of Data
VARIABLES
  QUANTITATIVE
    RATIO: pulse rate, height
    INTERVAL: temperature (36°-38°C)
  QUALITATIVE
    ORDINAL: social class
    NOMINAL: gender, ethnicity
Hypothesis: a prediction about a single population or
about the relationship between two or more populations.
Hypothesis testing: a procedure in which sample data
are employed to evaluate a hypothesis.
Research hypothesis: a general statement of what a
researcher predicts.
Statistical hypotheses: summarize the research
hypothesis with reference to the population parameter or
parameters under study. A statistical hypothesis is a
conjecture concerning one or more populations whose
veracity can be evaluated using sample data.
A null hypothesis is a statement of the status quo, one of
no difference or no effect.
The null hypothesis refers to a specified value of the
population parameter (e.g., µ, σ), not a sample statistic
(e.g., x̄).
An alternative hypothesis is one in which some
difference or effect is expected.
The critical region or rejection region is the set of values
of the test statistic for which we reject the null
hypothesis.
The acceptance region or region of nonrejection is the
set of values of the test statistic for which we do not
reject the null hypothesis.
These two regions are separated by the critical value of
the test statistic.
Illustration of regions of rejection and nonrejection in a two-tailed test
The Type I error is the error committed when
we decide to reject the null hypothesis
when in reality the null hypothesis is true.
The Type II error is the error committed when
we decide not to reject the null hypothesis
when in reality the null hypothesis is false.
Statistical            Actual Situation
Decision               H0 is true          H0 is false
Do not reject H0       Correct decision    Type II Error
Reject H0              Type I Error        Correct decision
The level of significance, denoted by α, is the
maximum probability of committing a Type I
error that the researcher is willing to accept.
Common values are 0.10, 0.05, and 0.01.
The smaller the value of α that we choose, the
lower the risk of committing a Type I error.
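One way to see the link between the level of significance and the Type I error is to simulate many studies in which the null hypothesis really is true and count how often it gets rejected. The sketch below assumes a one-sample z test with a known population standard deviation; the helper `z_test_p_value`, the sample size, and the number of trials are illustrative choices, not part of the original material.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is repeatable

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for a one-sample z test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
trials = 2000
rejections = 0
for _ in range(trials):
    # H0 is true here: every sample really comes from a population with mean 0.
    sample = [random.gauss(0, 1) for _ in range(30)]
    if z_test_p_value(sample, mu0=0, sigma=1) <= alpha:
        rejections += 1  # rejecting a true H0 is a Type I error

type_i_rate = rejections / trials
print(type_i_rate)
```

The observed rejection rate should land close to the chosen level of significance; choosing a smaller value lowers it accordingly.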
p-value or probability value is the
probability for a given statistical model that,
when the null hypothesis is true,
the statistical summary (such as the sample
mean difference between two compared
groups) would be the same as or more
extreme than the actual observed results.
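A small exact calculation may help illustrate "the same as or more extreme than". Suppose, hypothetically, that a coin is tossed 20 times and lands heads 15 times, and the null hypothesis is that the coin is fair. The sketch below adds up the probabilities, under that null hypothesis, of every outcome at least as far from the expected 10 heads as the observed result. The function name and the example numbers are assumptions made for illustration.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """P-value for H0: P(success) = 0.5, given k successes in n trials.

    Sums the probabilities of all outcomes at least as far from n/2 as k is.
    """
    observed_distance = abs(k - n / 2)
    total = 0.0
    for x in range(n + 1):
        if abs(x - n / 2) >= observed_distance:
            total += comb(n, x) * 0.5 ** n
    return total

p = binomial_two_sided_p(15, 20)
print(round(p, 4))
```

Here the outcomes counted as "more extreme" are 15 or more heads and, by symmetry, 5 or fewer heads.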
We compare the p-value with alpha (α) to determine
whether the observed data differ significantly from
what the null hypothesis predicts:
If the p-value is less than or equal to alpha (e.g., p ≤ .05), then
we reject the null hypothesis, and we say the result is
statistically significant.
If the p-value is greater than alpha (e.g., p > .05), then we do not
reject the null hypothesis, and we say that the result is
statistically nonsignificant (n.s.).
Conventions*
P > 0.10           non-significant evidence against H0
0.05 < P ≤ 0.10    marginally significant evidence against H0
0.01 < P ≤ 0.05    significant evidence against H0
P ≤ 0.01           highly significant evidence against H0
Examples
P = .27            non-significant evidence against H0
P = .01            highly significant evidence against H0
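The conventions above can be written as a small lookup function. This is only a sketch of the labels listed here; the function name `describe_p_value` is hypothetical, and other texts may draw the boundaries differently.

```python
def describe_p_value(p):
    """Map a p-value to the conventional labels listed above."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p > 0.10:
        return "non-significant"
    if p > 0.05:
        return "marginally significant"
    if p > 0.01:
        return "significant"
    return "highly significant"

for p in (0.27, 0.01):
    print(p, describe_p_value(p), "evidence against H0")
```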
Let α ≡ probability of erroneously rejecting H0
Set the α threshold (e.g., let α = .10, .05, or whatever)
Reject H0 when p ≤ α
Retain H0 when p > α
Example: Set α = .10. Find p = .27, so retain H0
Example: Set α = .01. Find p = .001, so reject H0
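Step 1. Formulate H0 and H1.
Step 2. Select an appropriate test.
Step 3. Choose the level of significance.
Step 4. Calculate the test statistic, TSCAL.
Step 5. Follow one of two routes:
(a) Determine the probability associated with the test statistic
and compare it with the level of significance; or
(b) Determine the critical value of the test statistic, TSCR, and
determine whether TSCAL falls into the rejection or the
nonrejection region.
Step 6. Reject or do not reject H0.
Step 7. Draw the research conclusion.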
Hypothesis Testing Using the Critical Value
Step 1. State the null hypothesis, H0, and the alternative hypothesis, Ha.
Step 2. Choose the level of significance, α.
Step 3. Determine the appropriate statistical technique and
corresponding test statistic to use.
Step 4. Set up the decision rule. Identify the critical value or values that will
separate the rejection and nonrejection regions.
Decision Rule: Reject H0 if the value of the test statistic falls in the
region of rejection.
Step 5. Collect the data and compute the value of the test statistic.
Step 6. Determine whether the value of the test statistic falls in the
rejection or the nonrejection region. Make the statistical decision.
Step 7. Express the statistical decision in terms of the problem.
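The seven steps can be traced in a short script. This is a sketch under stated assumptions: a two-tailed one-sample z test with a known population standard deviation, and hypothetical numbers (hypothesized mean 100, sigma 15, n = 36, sample mean 106) chosen only to illustrate the procedure.

```python
from statistics import NormalDist

# Steps 1-2: H0: mu = 100, Ha: mu != 100 (two-tailed), alpha = 0.05.
mu0 = 100
alpha = 0.05

# Step 3: one-sample z test, assuming the population sigma is known.
sigma = 15

# Step 4: decision rule. Reject H0 if |z| >= the critical value.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: (hypothetical) data summary and the test statistic.
n = 36
sample_mean = 106
z = (sample_mean - mu0) / (sigma / n ** 0.5)

# Step 6: statistical decision.
reject_h0 = abs(z) >= z_critical

# Step 7: state the result in terms of the problem.
print(f"z = {z:.2f}, critical value = {z_critical:.2f}, reject H0: {reject_h0}")
```

With these numbers the test statistic lies beyond the critical value, so the decision is to reject the null hypothesis at the 0.05 level.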
Hypothesis Testing Using the p-Value
Step 1. State the null hypothesis, H0, and the alternative hypothesis,
Ha.
Step 2. Choose the level of significance, α.
Step 3. Determine the appropriate statistical technique and
corresponding test statistic to use.
Step 4. Collect the data and compute the value of the test statistic.
Step 5. Compute the p-value. Compare the p-value with the level
of significance, α. Make the statistical decision.
Decision Rule: Reject H0 if the p-value is less than or equal to the
level of significance.
Step 6. Express the statistical decision in terms of the problem.
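The same procedure can be sketched with the p-value as the basis for the decision. The example assumes a one-tailed (upper-tail) one-sample z test with a known population standard deviation; the ten data values, the hypothesized mean of 100, and sigma = 15 are hypothetical.

```python
from statistics import NormalDist, mean

# Steps 1-2: H0: mu = 100, Ha: mu > 100 (one-tailed), alpha = 0.05.
mu0 = 100
alpha = 0.05

# Step 3: one-sample z test, assuming the population sigma is known.
sigma = 15

# Step 4: (hypothetical) data and the test statistic.
data = [108, 95, 112, 101, 99, 117, 104, 110, 96, 108]
z = (mean(data) - mu0) / (sigma / len(data) ** 0.5)

# Step 5: upper-tail p-value, then compare it with alpha.
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value <= alpha

# Step 6: state the result in terms of the problem.
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

Here the p-value is larger than the level of significance, so the null hypothesis is not rejected; the sample mean of 105 is not convincing evidence that the population mean exceeds 100.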