IMS - Lecture Four
Density Curves
• Definition: A density curve is a smooth curve that represents the overall pattern of a distribution. It approximates the empirical
distribution of data.
• Properties:
• The total area under a density curve equals 1, representing the entirety of the distribution.
• A density curve can be skewed left, skewed right, or symmetric.
• Importance:
• Area under the curve between two points represents the proportion or probability of values lying within that range.
• The density curve helps visualize the overall distribution of a dataset.
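The two area properties above can be checked numerically. The sketch below is only an illustration: the exponential density and the trapezoid-rule helper `area` are assumptions chosen for the example, not part of the lecture. It shows a right-skewed density curve whose total area is 1 and whose area between two points gives the proportion of values in that range.

```python
import math

def density(x):
    # Hypothetical right-skewed density curve: exponential with rate 1.
    return math.exp(-x) if x >= 0 else 0.0

def area(f, a, b, n=100_000):
    # Approximate the area under f between a and b with the trapezoid rule.
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# The upper limit 50 stands in for infinity; the tail beyond it is negligible.
total_area = area(density, 0.0, 50.0)

# Proportion of values lying between 1 and 2.
prop_between = area(density, 1.0, 2.0)

print(round(total_area, 4))
print(round(prop_between, 4))
```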
The Normal Distribution
• Definition: A special type of density curve that is symmetrical and bell-shaped.
• Characteristics:
• Symmetry: The normal distribution is perfectly symmetric about its mean (μ).
• Parameters:
• μ (mean): Determines the center of the distribution.
• σ (standard deviation): Determines the spread (wider distributions have larger standard deviations).
• The mean, median, and mode are all equal in a normal distribution.
• Key Concept: Many natural phenomena are approximately normally distributed (e.g., height, intelligence test scores).
• Standard Normal Distribution:
• A normal distribution with a mean of 0 and a standard deviation of 1.
• Any normal distribution can be transformed into a standard normal distribution using z-scores.
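A short sketch of these characteristics, using Python's standard-library `statistics.NormalDist`. The parameter values (mean 100, standard deviations 10 and 20) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Two hypothetical normal distributions with the same center, different spread.
narrow = NormalDist(mu=100, sigma=10)
wide = NormalDist(mu=100, sigma=20)

# Symmetry: the curve has the same height at equal distances from the mean.
left = narrow.pdf(100 - 7)
right = narrow.pdf(100 + 7)

# Spread: a larger standard deviation gives a wider, flatter curve,
# so the peak at the mean is lower.
peak_narrow = narrow.pdf(100)
peak_wide = wide.pdf(100)

# Mean equals median: exactly half of the area lies below the mean.
below_mean = narrow.cdf(100)

print(left == right, peak_narrow > peak_wide, below_mean)
```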
Standard Scores (Z-Scores)
• Definition: A z-score represents how many standard deviations a value (X) is from the mean (μ) of the distribution.
• Formula: Z = (X − μ) / σ
• X: the raw score.
• μ: the population mean.
• σ: the population standard deviation.
• Interpretation:
• A positive z-score means the value is above the mean.
• A negative z-score means the value is below the mean.
• A z-score of 0 means the value is exactly at the mean.
• Effects of Standardizing:
• If the original distribution is normal, standardizing converts it into the standard normal distribution N(0,1).
• Effects:
• The mean becomes 0.
• The standard deviation becomes 1.
• The shape of the distribution remains unchanged.
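The effects of standardizing can be seen on a small simulation. This is a sketch under stated assumptions: the data are simulated from a skewed (exponential) distribution, and the helper names `standardize` and `skewness` are hypothetical. Skewness is used here as a simple stand-in for "shape".

```python
import random
from statistics import fmean, pstdev

random.seed(0)
# Hypothetical right-skewed data, deliberately NOT normal.
data = [random.expovariate(1.0) for _ in range(10_000)]

def standardize(values):
    # Apply Z = (X - mu) / sigma to every value.
    mu = fmean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

def skewness(values):
    # Average cubed z-score: roughly 0 for symmetric data, positive for right skew.
    mu = fmean(values)
    sigma = pstdev(values)
    return fmean([((x - mu) / sigma) ** 3 for x in values])

z = standardize(data)

print(round(fmean(z), 6))    # mean of the z-scores
print(round(pstdev(z), 6))   # standard deviation of the z-scores
print(round(skewness(data), 3), round(skewness(z), 3))  # shape before and after
```

The z-scores have mean 0 and standard deviation 1, but the skewness is the same before and after: standardizing skewed data does not make it normal.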
Determining Proportions Based on Scores
1. Formulate the Question: Clearly define the range or value of interest.
2. Standardize the Value (Calculate Z-scores): Z = (X − μ) / σ
3. Estimate Proportions:
• Use the 68-95-99.7 rule (valid for normal distributions):
• About 68% of data falls within 1 standard deviation of the mean.
• About 95% falls within 2 standard deviations.
• About 99.7% falls within 3 standard deviations.
• Alternatively, use a z-table to find precise probabilities for z-scores.
4. Draw a Conclusion: Based on the z-score and the table, estimate the probability (proportion).
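The four steps above can be sketched in code, with `statistics.NormalDist().cdf` playing the role of the z-table. The mean of 100 and standard deviation of 15 are hypothetical values for illustration, and the helper names are assumptions.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0      # hypothetical population mean and standard deviation
std_normal = NormalDist()    # standard normal distribution N(0,1)

def z_score(x, mu, sigma):
    # Step 2: standardize the value.
    return (x - mu) / sigma

def proportion_between(lo, hi, mu, sigma):
    # Step 3: area under the standard normal curve between the two z-scores.
    return std_normal.cdf(z_score(hi, mu, sigma)) - std_normal.cdf(z_score(lo, mu, sigma))

# Checking the 68-95-99.7 rule against the exact areas.
within_1 = proportion_between(mu - 1 * sigma, mu + 1 * sigma, mu, sigma)
within_2 = proportion_between(mu - 2 * sigma, mu + 2 * sigma, mu, sigma)
within_3 = proportion_between(mu - 3 * sigma, mu + 3 * sigma, mu, sigma)

# Step 1 question: what proportion of scores lies below 130?
below_130 = std_normal.cdf(z_score(130, mu, sigma))

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
print(round(below_130, 4))
```

The exact areas (roughly 0.683, 0.954 and 0.997) show that the 68-95-99.7 rule is a rounded approximation.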
Determining Scores Based on Proportions
• Use the z-table in reverse:
• If given a proportion (probability), look up the corresponding z-score.
• Rearrange the z-score formula to solve for the original score: X = μ + Z · σ
• Example: If you need the score where 95% of data lies below it, find the Z-score corresponding to 0.95 (~1.645) and solve for X
using the mean and standard deviation.
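The 95% example can be worked through as follows. The mean of 100 and standard deviation of 15 are hypothetical, since the notes do not give specific values; `NormalDist().inv_cdf` stands in for reading the z-table in reverse.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0               # hypothetical mean and standard deviation

# Reverse z-table lookup: the z-score with 95% of the area below it.
z_95 = NormalDist().inv_cdf(0.95)

# Rearranged z-score formula: X = mu + Z * sigma
x_95 = mu + z_95 * sigma

# Sanity check: the proportion below x_95 should come back as 0.95.
check = NormalDist(mu, sigma).cdf(x_95)

print(round(z_95, 3), round(x_95, 2), round(check, 4))
```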
Assessing Normality
Visual Methods:
• Histogram: Check if the shape resembles a bell curve.
• Q-Q Plot: Plots expected normal values against actual data points. If the points fall close to a straight line (the 45° line when both axes are standardized), the data are approximately normal.
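A Q-Q plot can be built by hand, which makes clear what the plot compares. The sketch below computes the plot coordinates without drawing them. The plotting positions (i − 0.5)/n are one common convention among several, and the simulated data, the helper names, and the use of a correlation coefficient as a rough summary of "how straight the line is" are assumptions for illustration.

```python
import random
from statistics import NormalDist, fmean, pstdev

def qq_points(values):
    # Observed: standardized data, sorted from smallest to largest.
    n = len(values)
    mu, sigma = fmean(values), pstdev(values)
    observed = sorted((x - mu) / sigma for x in values)
    # Expected: standard normal quantiles at plotting positions (i - 0.5) / n.
    expected = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return expected, observed

def correlation(xs, ys):
    # Pearson correlation; values near 1 mean the points lie close to a line.
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(1)
normal_data = [random.gauss(50, 5) for _ in range(2000)]       # simulated normal data
skewed_data = [random.expovariate(1.0) for _ in range(2000)]   # simulated skewed data

r_normal = correlation(*qq_points(normal_data))
r_skewed = correlation(*qq_points(skewed_data))

print(round(r_normal, 3), round(r_skewed, 3))
```

Passing the `expected` and `observed` lists to any plotting tool gives the Q-Q plot itself; the skewed sample bends away from the line, which shows up here as a lower correlation.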
Numerical Methods:
• Skewness: Symmetric distributions have skewness close to 0. Positive skew has a long right tail, and negative skew has a long left
tail.
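• Kurtosis: Measures the "tailedness" of the distribution. A normal distribution has a kurtosis of 3, i.e., an excess kurtosis of 0. Many software packages report excess kurtosis, in which case values near 0 are consistent with normality.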
• Kolmogorov-Smirnov Test: A statistical test for normality. However, with large samples the test becomes very sensitive and flags even trivial departures from normality as significant.
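The numerical measures above can be computed directly from their definitions. This is a minimal sketch on simulated data, with hypothetical helper names. It uses population (biased) moment formulas, which may differ slightly from the adjusted formulas some statistical packages report. The last function computes only the Kolmogorov-Smirnov distance, not a p-value: because the mean and standard deviation are estimated from the same sample, standard K-S tables would not apply directly.

```python
import random
from statistics import NormalDist, fmean, pstdev

def skewness(values):
    # Average cubed z-score: near 0 for symmetric data.
    mu, sigma = fmean(values), pstdev(values)
    return fmean([((x - mu) / sigma) ** 3 for x in values])

def excess_kurtosis(values):
    # Average fourth-power z-score minus 3: near 0 for normal data.
    mu, sigma = fmean(values), pstdev(values)
    return fmean([((x - mu) / sigma) ** 4 for x in values]) - 3.0

def ks_statistic(values):
    # Largest vertical gap between the empirical CDF and a normal CDF
    # fitted with the sample mean and standard deviation.
    fitted = NormalDist(fmean(values), pstdev(values))
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

random.seed(2)
normal_data = [random.gauss(0, 1) for _ in range(5000)]        # simulated normal data
skewed_data = [random.expovariate(1.0) for _ in range(5000)]   # simulated right-skewed data

for name, sample in [("normal", normal_data), ("skewed", skewed_data)]:
    print(name,
          round(skewness(sample), 2),
          round(excess_kurtosis(sample), 2),
          round(ks_statistic(sample), 3))
```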