Lecture 3
• Range = difference between the highest and lowest scores (e.g., 3, 11, 3, 4, 7, 8 –>
3, 3, 4, 7, 8, 11 (scores in order) –> 11 (high) minus 3 (low) = 8)
• Variance = average of the squared distance of each score from the mean
• (e.g., 3, 11, 3, 4, 7, 8)
• (The mean is 36 / 6 = 6; we subtract 6 from each score and then square the
result) –>
• 3 – 6 = -3 –> -3 X -3 = 9
• 11 – 6 = 5 –> 5 X 5 = 25
• 3 – 6 = -3 –> -3 X -3 = 9
• 4 – 6 = -2 –> -2 X -2 = 4
• 7 – 6 = 1 –> 1 X 1 = 1
• 8 – 6 = 2 –> 2 X 2 = 4
• Sum of the squared distances = 9 + 25 + 9 + 4 + 1 + 4 = 52 –> 52 / 6 ≈ 8.67 (variance)
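The range and variance steps above can be checked with a short Python sketch. It assumes the population form of the variance (dividing by the number of scores, as "average" suggests); the variable names are illustrative and not part of the lecture.

```python
# Scores from the worked example above.
scores = [3, 11, 3, 4, 7, 8]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Mean: sum of the scores divided by how many there are.
mean = sum(scores) / len(scores)

# Squared distance of each score from the mean.
squared_distances = [(x - mean) ** 2 for x in scores]

# Variance as the average squared distance (assumption: divide by N,
# the population form; a sample variance would divide by N - 1).
variance = sum(squared_distances) / len(scores)

print(score_range, mean, round(variance, 2))
```

Run as written, this prints 8 6.0 8.67, matching the hand calculation.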
T score (a transformed z score) –> multiply the z score by 10 and add 50
(e.g., z score = -1.5 –> -1.5 X 10 = -15 –> -15 + 50 = 35)
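The conversion can be written as a one-line function. This is only a sketch of the rule stated above; the function name t_score is an illustrative choice, not something from the lecture.

```python
def t_score(z):
    """Convert a z score to a T score: multiply by 10, then add 50."""
    return z * 10 + 50


print(t_score(-1.5))  # 35.0, the example from the notes
print(t_score(0))     # 50, a score exactly at the mean
```

A z score of 0 (a score exactly at the mean) always maps to a T score of 50, and each standard deviation above or below the mean moves the T score by 10 points.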
The skew is the tail. If the tail (skew) is on the left (negative side), we have a negatively skewed distribution. That means
that more of the subjects scored on the high end (because most of the people are not in the tail, where the low scores
are).
If the skew (tail) is on the right (positive side), we have a positively skewed distribution. That means more people scored low (because most
of the people are not in the tail, where the high scores are).
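The direction of the skew can also be read from the sign of a skewness statistic. The sketch below assumes one common definition (the average of the cubed z scores), which the lecture does not name, and uses two made-up score lists purely for illustration.

```python
def skewness(scores):
    """Average cubed z score (one common skewness measure; an assumption here)."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    return sum(((x - mean) / sd) ** 3 for x in scores) / n


# Hypothetical data: most scores are high, with one very low score (tail on the left).
low_tail = [1, 7, 8, 8, 9, 9, 9, 10, 10]
# Hypothetical data: most scores are low, with one very high score (tail on the right).
high_tail = [1, 1, 2, 2, 2, 3, 3, 4, 10]

print(skewness(low_tail) < 0)   # True: negative skew
print(skewness(high_tail) > 0)  # True: positive skew
```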
When most of the scores are clustered in the middle (a tall, narrow peak), we have a leptokurtic distribution.
When the scores are widely spread without many people in the middle (a flat, broad curve), we have a platykurtic
distribution.
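As a rough numerical counterpart to these two shapes, the sketch below computes excess kurtosis (the average of the z scores raised to the fourth power, minus 3, so that a normal curve scores about 0). That definition is an assumption, since the lecture describes the shapes only informally, and both score lists are made up for illustration.

```python
def excess_kurtosis(scores):
    """Average fourth-power z score minus 3 (assumed definition; normal curve ~ 0)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # variance
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


# Hypothetical data: nearly everyone in the middle, two extreme scores.
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 5, 9]
# Hypothetical data: scores spread evenly, no pile-up in the middle.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(excess_kurtosis(peaked) > 0)  # True: leptokurtic
print(excess_kurtosis(flat) < 0)    # True: platykurtic
```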
If a set of scores does not form a normal distribution (e.g., it is skewed), then the characteristics of the normal curve do not
apply. For example, we could not assume that about 68% of the scores fall within one standard deviation of the mean if the distribution were
negatively skewed.
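A small simulation illustrates the point. The sketch below is not from the lecture: it draws one roughly normal set of scores and one negatively skewed set (a flipped exponential, chosen only as a convenient skewed shape) and counts the share of scores within one standard deviation of the mean. The means, spreads, and sample size are arbitrary assumptions.

```python
import random


def share_within_one_sd(scores):
    """Fraction of scores no more than one standard deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    return sum(1 for x in scores if abs(x - mean) <= sd) / n


rng = random.Random(0)  # fixed seed so the run is repeatable

# Roughly normal scores (assumed mean 50, standard deviation 10).
normal_scores = [rng.gauss(50, 10) for _ in range(100_000)]

# Negatively skewed scores: subtract an exponential draw from 100 so the
# long tail sits on the low side.
skewed_scores = [100 - rng.expovariate(0.1) for _ in range(100_000)]

print(round(share_within_one_sd(normal_scores), 2))
print(round(share_within_one_sd(skewed_scores), 2))
```

The first share should come out close to 0.68, while the skewed set should land noticeably higher (around 0.86 for this particular shape), so the 68% figure cannot be carried over.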
Big Data Analytics, 14