Chapters 1-4 for Mini Exam 1: Review Questions, Intermediate Statistics
Chapter 1
1. Stevens' Four Levels of Measurement:
o Nominal, Ordinal, Interval, Ratio.
o Ratio: Has a true zero, allows all mathematical operations, and comparisons
using ratios (e.g., weight).
2. Differences Between Nominal and Interval (or Ordinal and Ratio):
o Nominal vs. Interval: Nominal data categorizes without order; interval data has
equal intervals but no true zero.
o Ordinal vs. Ratio: Ordinal ranks data without consistent intervals; ratio has equal
intervals and a true zero.
3. Arithmetic with Ordinal and Ratio Variables:
o Ordinal: Only rank comparisons.
o Ratio: Allows all mathematical operations.
4. Rationale for Treating Ordinal as Interval:
o Assumes approximately equal intervals, enabling use of powerful statistics like
means.
5. Situations for Treating Ordinal as Interval:
o Likert scales, socioeconomic status.
6. Class of Variables at Any Stevens' Level:
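o Dichotomous (two-category) variables, a special case of discrete variables, can be
treated at any level: with only two categories there is a single interval, so
questions of order and equal spacing do not arise.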
7. Why the Mean is Preferred for Some Ordinal Variables:
o The mean uses every value, so it is more sensitive to small differences than the
median when an ordinal variable with many categories is treated as interval.
8. Qualitative vs. Quantitative Variables:
o Qualitative: Descriptive, non-numerical (e.g., gender).
o Quantitative: Numerical, measurable (e.g., height).
9. Discrete, Continuous, and Discrete-Continuous Variables:
o Discrete: Specific values (e.g., number of children).
o Continuous: Can take any value within a range (e.g., height).
o Discrete-Continuous: Technically discrete, but with so many possible values that it
is treated as continuous (e.g., income).
10. Graphing Discrete, Continuous, and Discrete-Continuous Variables:
o Discrete: Bar charts.
o Continuous: Histograms, line graphs.
o Discrete-Continuous: Line charts, scatter plots.
11. Difference Between Bar Chart and Histogram:
o Bar Chart: For discrete data; bars separated.
o Histogram: For continuous data; bars touch, shows distribution.
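The distinction can be sketched in code: a bar chart needs one count per distinct value, while a histogram first groups continuous values into adjacent bins. The data, the starting point of 155, and the bin width of 5 below are made up for illustration.

```python
from collections import Counter

# Hypothetical data for illustration only.
children = [0, 2, 1, 2, 3, 0, 2, 1]                      # discrete
heights_cm = [158.2, 161.5, 164.9, 170.1, 171.3, 176.8]  # continuous

# Bar chart input: one frequency per distinct value (bars stay separate).
bar_counts = Counter(children)

# Histogram input: frequencies per adjacent, equal-width bin (bars touch).
bin_width = 5   # assumed bin width
start = 155     # assumed lower edge of the first bin
hist_counts = Counter(
    start + bin_width * int((h - start) // bin_width) for h in heights_cm
)

print(sorted(bar_counts.items()))    # (value, frequency) pairs
print(sorted(hist_counts.items()))   # (lower edge of bin, frequency) pairs
```

Changing the bin width changes the histogram's shape, which is one reason it describes a distribution rather than individual values.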
12. Determining Level of Measurement:
o Based on whether the data is qualitative or quantitative, and if it has a
meaningful zero, ranking, or equal intervals.
Chapter 2
1. Rules for Central Tendency by Level of Measurement:
o Nominal: Mode.
o Ordinal: Median.
o Interval/Ratio: Mean (if data is symmetric).
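As a quick illustration of these rules, the sketch below applies each measure to a variable of the matching level using Python's standard `statistics` module. The three small datasets are invented for the example.

```python
import statistics

# Hypothetical data for illustration only.
eye_color = ["brown", "blue", "brown", "green", "brown"]  # nominal
agreement = [1, 2, 2, 3, 4, 5, 5]                          # ordinal (1-5 scale)
test_scores = [70, 75, 80, 85, 90]                         # interval/ratio, symmetric

nominal_center = statistics.mode(eye_color)     # most frequent category
ordinal_center = statistics.median(agreement)   # middle value of the ranked data
ratio_center = statistics.mean(test_scores)     # arithmetic average

print(nominal_center, ordinal_center, ratio_center)
```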
2. Additional Considerations:
o Data distribution, sample size, analysis purpose, and scale type.
3. When to Use Mode and When It’s Misleading:
o Use Mode: For nominal data, bimodal distributions, or most frequent
occurrence.
o Misleading: If all values are similarly frequent.
4. Median and When to Use It:
o Median: Middle value in an ordered dataset.
o Use: Skewed data, ordinal data, outlier-prone data.
5. When the Median is Misleading:
o Misleading: Data clustered at one end.
o Solution: Use mean or trimmed mean.
6. When to Use Mean and When It’s Misleading:
o Use: Normally distributed data.
o Misleading: Extreme outliers or skewed data.
7. Alternatives to Mean for Small Ratio Data Samples:
o Use median, trimmed mean, or bootstrapping.
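A minimal sketch of why these alternatives help, assuming a made-up sample of ten values with one extreme case. The helper `trimmed_mean` and its 10% default are illustrative choices, not a standard from the course.

```python
import statistics

def trimmed_mean(values, proportion=0.1):
    """Mean after dropping the lowest and highest `proportion` of values."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # number of values cut from each end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)

# Hypothetical small sample with one extreme value.
incomes = [21, 24, 25, 27, 28, 30, 31, 33, 35, 400]

plain_mean = statistics.mean(incomes)   # pulled upward by the outlier
middle = statistics.median(incomes)     # unaffected by the outlier's size
trimmed = trimmed_mean(incomes, 0.1)    # drops 21 and 400 before averaging

print(plain_mean, middle, trimmed)      # 65.4 29.0 29.125
```

The trimmed mean lands close to the median while still averaging most of the data.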
1. When might we prefer an entropy measure over IQV or standard deviation?
o When measuring dispersion for categorical data with uneven distributions.
Entropy captures the uncertainty in distributions.
2. In the formula [N² − ∑ nj²] / [N² − N² / k], what do N, nj, and k stand for?
o N: Total number of cases.
o nj: Number of cases in category j.
o k: Number of categories.
o This formula is the Index of Qualitative Variation (IQV). It runs from 0 (all cases
in one category) to 1 (cases spread evenly across the k categories).
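The formula can be checked by hand or with a short sketch like the one below. The function name `iqv` and the two example datasets are assumptions made for illustration; by default the sketch takes k to be the number of categories that actually appear in the data, and it needs k of at least 2.

```python
from collections import Counter

def iqv(categories, k=None):
    """(N^2 - sum of n_j^2) / (N^2 - N^2 / k), as in the formula above."""
    counts = Counter(categories)
    n_total = sum(counts.values())
    if k is None:
        k = len(counts)   # assumes every possible category appears in the data
    sum_sq = sum(n_j ** 2 for n_j in counts.values())
    return (n_total ** 2 - sum_sq) / (n_total ** 2 - n_total ** 2 / k)

# Hypothetical data: 20 cases in 4 categories.
even = ["a", "b", "c", "d"] * 5           # 5 cases per category
lopsided = ["a"] * 17 + ["b", "c", "d"]   # 17, 1, 1, 1

print(iqv(even))       # maximum dispersion
print(iqv(lopsided))   # most cases in one category, so much lower
```

For the lopsided data the numerator is 400 − 292 = 108 and the denominator is 400 − 100 = 300, giving 0.36.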
3. In the formula −∑ pi[log₂(pi)], what does pi stand for? What will it tell us?
o pi: Proportion of cases in category i.
o It measures entropy, indicating the uncertainty or randomness in the
distribution.
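A minimal sketch of the entropy formula, using the same kind of made-up categorical data. The function name `entropy_bits` is an illustrative choice; the result is in bits because the logarithm is base 2.

```python
import math
from collections import Counter

def entropy_bits(categories):
    """-sum of p_i * log2(p_i) over the categories that occur in the data."""
    counts = Counter(categories)
    n_total = sum(counts.values())
    return -sum((n / n_total) * math.log2(n / n_total) for n in counts.values())

# Hypothetical data: 20 cases in 4 categories.
even = ["a", "b", "c", "d"] * 5           # p_i = 0.25 for each category
lopsided = ["a"] * 17 + ["b", "c", "d"]   # p = 0.85, 0.05, 0.05, 0.05

print(entropy_bits(even))       # 2.0, the maximum for 4 categories (log2 of 4)
print(entropy_bits(lopsided))   # lower: the category is easier to predict
```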
4. Reply to "No measure of dispersion for nominal/ordinal variables":
o Measures like IQV or entropy assess the spread or diversity in categories,
providing meaningful insights even without distances between categories.
5. How are the IQV and entropy measures complementary?
o Both assess dispersion but in different ways: IQV is simpler for comparing
diversity, while entropy measures the unpredictability of distributions.
6. Common measures of dispersion for ordinal variables:
o Interquartile Range (IQR) and percentiles. Percentiles are safer as they reflect the
data’s distribution without assuming equal intervals.
7. IDR of 21 and MAD of 10 for grades:
o IDR of 21: The middle 80% of grades (10th to 90th percentile) span 21 points.
o MAD of 10: On average, grades deviate by 10 points from the median.
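Both measures can be sketched as follows. Two assumptions are worth flagging: MAD is taken here as the mean absolute deviation from the median, matching the wording above (some texts use the median absolute deviation instead), and the deciles come from `statistics.quantiles`, whose default interpolation may differ slightly from a textbook's. The grades are invented.

```python
import statistics

def interdecile_range(values):
    """Distance between the 10th and 90th percentiles (middle 80% of cases)."""
    deciles = statistics.quantiles(values, n=10)   # 9 cut points: 10th..90th
    return deciles[-1] - deciles[0]

def mean_abs_dev_from_median(values):
    """Average absolute distance of each value from the median."""
    med = statistics.median(values)
    return sum(abs(x - med) for x in values) / len(values)

# Hypothetical grades for illustration only.
grades = [55, 60, 62, 68, 70, 71, 75, 80, 85, 94]

idr = interdecile_range(grades)
mad = mean_abs_dev_from_median(grades)
print(idr, mad)
```

Here the median is 70.5 and the absolute deviations sum to 90, so the MAD is 9.0 points.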
8. Suggested measure of dispersion for ratio variables and when it is misleading:
o Standard deviation (SD) is suggested, but it can be misleading for skewed data
with outliers.
9. Formula for standard deviation:
o SD = √(Σ(Xi − X̄)² / N).
o The square root ensures the units of dispersion match the original variable.
10. Meaning of symbols in SD formula:
o Xi: Individual values.
o X̄: Mean of the data.
o N: Number of observations.
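The formula translates almost term by term into code. The sketch below divides by N, as in the formula above, and compares the result with `statistics.pstdev`; the sample version that divides by N − 1 is `statistics.stdev`. The scores are made up.

```python
import math
import statistics

def population_sd(values):
    """Square root of the mean squared deviation from the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

# Hypothetical scores: mean is 5, squared deviations sum to 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_sd(scores))       # sqrt(32 / 8) = 2.0
print(statistics.pstdev(scores))   # library version of the same formula
```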
11. Advantages of the standard deviation:
o Accounts for all data points, directly relates to variance, and is sensitive to
deviations from the mean.
12. When to prefer IQR over standard deviation:
o For skewed data or when dealing with outliers, as IQR focuses on the middle 50% of
the data.
13. Why researchers prefer SD over IQR/IDR/MAD for ordinal data:
o SD feeds directly into more advanced techniques (z-scores, correlation, regression),
which is why researchers accept treating ordinal data as interval.
14. Mean and SD of a z-score:
o Mean = 0, SD = 1.
o Z-scores help standardize data and compare values across different distributions.
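A short sketch confirming that standardized scores have mean 0 and SD 1. It assumes the population SD (dividing by N) is used for standardizing; the scores are made up.

```python
import statistics

def z_scores(values):
    """Subtract the mean from each value and divide by the population SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(x - mean) / sd for x in values]

# Hypothetical scores with mean 5 and population SD 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(scores)

print(z)                      # a score of 9 becomes z = 2.0
print(statistics.mean(z))     # 0 (up to rounding)
print(statistics.pstdev(z))   # 1 (up to rounding)
```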
15. Percentage of observations within 1 or 2 SDs of the mean (normally distributed):
o Within 1 SD: ~68%.
o Within 2 SDs: ~95%.
16. Empirical rule:
o For normal distributions: 68% of data within 1 SD, 95% within 2 SDs, and 99.7%
within 3 SDs of the mean.
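These percentages can be recovered from the normal curve itself. The sketch below uses `statistics.NormalDist` from the standard library; the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))   # 68.3, 95.4, 99.7
```

The familiar 68 / 95 / 99.7 figures are these values rounded.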
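17. Percentage of observations within 2 SDs of the mean for a continuous unimodal
variable:
o At least about 89% (the Vysochanskij-Petunin bound: 1 − 4/(9 × 2²) = 8/9). The ~95%
figure applies only if the distribution is also normal; with no shape assumption at
all, Chebyshev's inequality guarantees only 75%.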
1. Draw the distributions:
o Leptokurtic rightward skewed: High peak with a long tail to the right.
o Platykurtic symmetric: Flat, broad peak with symmetry.
o Bimodal: Two distinct peaks.
2. Kurtosis vs. Skewness:
o Kurtosis: Measures the "tailedness" or peak height of a distribution.
o Skewness: Measures the asymmetry of a distribution (left or right skew).
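Both statistics can be computed from central moments. The sketch below uses the simple population (moment-based) formulas, which is an assumption; statistical software often applies small-sample corrections, so its output can differ slightly. Kurtosis is reported as excess kurtosis, so a normal distribution scores 0. The datasets are made up.

```python
def central_moment(values, order):
    """Average of (x - mean) raised to the given power."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n

def skewness(values):
    """Third central moment scaled by SD cubed: sign gives the tail direction."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth central moment scaled by variance squared, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical data for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]
two_point = [0, 0, 1, 1]   # all cases split evenly between two values

print(skewness(symmetric))          # 0.0
print(skewness(right_skewed))       # positive: long tail to the right
print(excess_kurtosis(two_point))   # -2.0, the lowest value excess kurtosis can take
```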
3. Level of measurement and kurtosis/skewness:
o Both require at least interval-level data to be meaningful, as they rely on numeric
distance between values.
4. For kurtosis to make sense:
o The distribution must have continuous, ordered categories (interval or ratio
data).
5. Difference between reverse J-shaped and uniform distribution; U-shaped and normal
distribution:
o Reverse J-shaped: Rapid decrease in frequencies after an initial high frequency.
o Uniform distribution: Equal frequencies across all categories.
o U-shaped: High frequencies at both extremes, low in the middle.
o Normal distribution: Symmetrical with a single peak in the middle.
6. Skewness of −.50 vs. 1.00 and why they may be misleading:
o Skewness values might not fully capture the visual shape, as distributions can
have more complexity. If meaningful, -0.50 indicates slight left skew, and 1.00
indicates moderate right skew.
7. Kurtosis of .50 vs. −2.50:
o .50: Mildly leptokurtic, indicating a higher peak and fatter tails.
o −2.50: Platykurtic in sign (flatter, lighter tails), but excess kurtosis cannot fall
below −2, so a reported value of −2.50 points to a computation or reporting error.
8. Moors' problem with kurtosis:
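o Moors argued that kurtosis reflects how dispersed values are around the two points
one SD above and below the mean, so a high value can come from mass piled at the
center, mass in the tails, or both. Reading it simply as "peakedness" can therefore
mislead.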