Statistics Learner Notes
Mean
• The sum of all the values ($x$) of the data set divided by the number of values ($n$).
• $\bar{x} = \frac{\sum x}{n}$
Median
• The middle value of an arranged data set.
• Divides data in two equal sets.
• The position of $Q_2 = \frac{1}{2}(n + 1)$.
Mode
• The value of the data set with the highest frequency/most common value.
Arrange data in
[Figure: box and whisker diagram, with the box and the two whiskers labelled]
Drawing tips
Important deductions
Skewness influences the mean. The more skewed the data, the less reliable the mean is as a
measure of central tendency. If the data is skewed to the left, the mean is too low. If it is skewed
to the right, the mean will be too high. For skewed data the best measure of central tendency is
the median, as it gives a better idea of where the centre of the data lies.
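As a small illustration of the deduction above, the sketch below compares the mean and the median of a data set that is skewed to the right. The values are invented for this example and do not come from any question in these notes.

```python
from statistics import mean, median

# Hypothetical data set (not from the notes): most values are small,
# with one large value stretching the right-hand tail.
data = [2, 3, 3, 4, 4, 5, 5, 6, 30]

data_mean = mean(data)      # pulled towards the large value
data_median = median(data)  # middle value of the ordered data

print("mean   =", round(data_mean, 2))
print("median =", data_median)
```

The single large value raises the mean well above the median, which is why the median is preferred for skewed data.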
QUESTION 1
The lengths of 20 children are measured (in centimetres) and the results are recorded. The data collected
is shown in the table below.
127 128 129 130 131 133 134 134 135 136
137 138 139 140 141 142 142 143 144 145
1.3 Sketch a Box and Whisker diagram to represent the data. (2)
[9]
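A box and whisker diagram is drawn from the five-number summary. The sketch below is one way to check that summary for the data in QUESTION 1. It assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data; statistical software often uses other quartile conventions and may give slightly different values. The function name `five_number_summary` is chosen for this example only.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle value when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Lengths (in cm) copied from the table in QUESTION 1.
lengths = [127, 128, 129, 130, 131, 133, 134, 134, 135, 136,
           137, 138, 139, 140, 141, 142, 142, 143, 144, 145]

summary = five_number_summary(lengths)
print(summary)
```

The box runs from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.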
QUESTION 2
The intelligence quotient score (IQ) of a Grade 10 class is summarised in the table below.
IQ INTERVAL FREQUENCY
90 ≤ x < 100 4
100 ≤ x < 110 8
110 ≤ x < 120 7
120 ≤ x < 130 5
130 ≤ x < 140 4
140 ≤ x < 150 2
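For a grouped frequency table like this one, the usual working involves class midpoints, cumulative frequencies and the modal class. The sketch below shows that working under the assumption that every value in a class is represented by the class midpoint, so the mean it produces is an estimate only. The variable names are chosen for this example.

```python
# (lower bound, upper bound, frequency) copied from the table above.
classes = [(90, 100, 4), (100, 110, 8), (110, 120, 7),
           (120, 130, 5), (130, 140, 4), (140, 150, 2)]

total = sum(f for _, _, f in classes)

# Estimated mean: sum of (midpoint x frequency) divided by the total frequency.
estimated_mean = sum(((lo + hi) / 2) * f for lo, hi, f in classes) / total

# Cumulative frequency at the upper bound of each class (the points of an ogive).
cumulative = []
running = 0
for _, hi, f in classes:
    running += f
    cumulative.append((hi, running))

# Modal class: the interval with the highest frequency.
modal_class = max(classes, key=lambda c: c[2])[:2]

print(total, estimated_mean, cumulative, modal_class)
```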
The table below shows the weight (to the nearest kilogram) of each of the 27 participants in a weight loss
program.
56 68 69 71 71 72 82 84 85
88 89 90 92 93 94 96 97 99
1.7 The person weighing 127 kg claims she weighs more than one standard deviation above the
mean. Do you agree with this person? Use calculations to motivate your answer. (3)
[14]
ANSWER BOOK
WEIGHT LOSS IN 4 WEEKS (IN GRAMS)    FREQUENCY
7. Press SHIFT then press 1.
8. Press 5: Reg
9. Press 1 then =: to find A
10. Press SHIFT then press 1.
11. Press 5: Reg
12. Press 2 then =: to find B
In order to draw the regression line, substitute any two x-values that lie between the minimum and
maximum x-values into the equation of the regression line, plot the two points and then join them up.
MONTHLY INCOME (IN RANDS)       9 000   13 500   15 000   16 500   17 000   20 000
MONTHLY REPAYMENT (IN RANDS)    2 000    3 000    3 500    5 200    5 500    6 000
3.1 Determine the equation of the least squares regression line for the data. (3)
3.2 If a person earns R14 000 per month, predict the monthly repayment that the person
could make towards a motor vehicle. (2)
3.3 Determine the correlation coefficient between the monthly income and the monthly
repayment of a motor vehicle. (1)
3.4 A person who earns R18 000 per month has to decide whether to spend R9 000 as a monthly
repayment of a motor vehicle, or not. If the above information is a true representation of the
population data, which of the following would the person most likely decide on:
A. Spend R9 000 per month because there is a very strong positive
correlation between the amount earned and the monthly repayment.
B. NOT to spend R9 000 per month because there is a very weak positive
correlation between the amount earned and the monthly repayment.
C. Spend R9 000 per month because the point (18 000 ; 9 000) lies very near to
the least squares regression line.
D. NOT to spend R9 000 per month because the point (18 000 ; 9 000) lies very
far from the least squares regression line. (2)
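The calculator gives A, B and r directly. As a cross-check, the sketch below computes the same quantities from the standard least squares formulas for the line y = A + Bx, using the income and repayment data from the table. The function name `least_squares` and the variable names are chosen for this example; the last two lines show how a point can be compared with the line.

```python
from math import sqrt

# Data copied from the table: monthly income (x) and monthly repayment (y), in rands.
income = [9000, 13500, 15000, 16500, 17000, 20000]
repayment = [2000, 3000, 3500, 5200, 5500, 6000]

def least_squares(xs, ys):
    """Return (a, b, r) for the line y = a + b*x and the correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx                # gradient of the regression line
    a = y_bar - b * x_bar        # y-intercept: the line passes through (x_bar, y_bar)
    r = sxy / sqrt(sxx * syy)    # correlation coefficient
    return a, b, r

a, b, r = least_squares(income, repayment)

# Predicted repayment for an income of R14 000 (inside the range of the data).
predicted = a + b * 14000

# Vertical distance between the point (18 000 ; 9 000) and the regression line.
distance = 9000 - (a + b * 18000)

print(round(a, 2), round(b, 2), round(r, 2), round(predicted, 2), round(distance, 2))
```

A large vertical distance means the point lies far from the line, which is the kind of comparison QUESTION 3.4 asks for.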
0 < x ≤ 100 7
4.1 How many people paid R200 or less on their monthly cellphone contracts? (1)
4.2 Use the information above to show that a = 24 and b = 16. (5)
4.3 Write down the modal class for the data. (1)
4.4 On the grid provided in the ANSWER BOOK, draw an ogive (cumulative frequency
graph) to represent the data. (4)
4.5 Determine how many people paid more than R420 per month for their cellphone
contracts. (4)
5.1.1 Write down the total number of food items ordered from the menu during this hour (1)
5.1.2 Write down the modal class of the data (1)
5.1.3 How long did it take to order the first 30 food items? (1)
5.1.4 How many food items were ordered in the last 15 minutes? (2)
5.1.5 Determine the 75th percentile for the data. (2)
5.1.6 Calculate the interquartile range of the data. (2)
5.2.1 Calculate:
(a) The mean of the data (2)
(b) The standard deviation of the data (2)
5.2.2 Mary also works part-time as a waitress at the same restaurant. Over the same
15-day period Mary collected the same mean amount in tips as Reggie, but her
standard deviation was R14.
Using the available information, comment on the:
(a) Total amount in tips that they EACH collected over the 15-day period. (1)
(b) Variation that EACH of them received in daily tips over this period. (1)
[15]
SCATTER PLOT
[Scatter plot: average serve speed (in km/h) on the vertical axis, from 205 to 255, against height of a player (in metres) on the horizontal axis, from 1,8 to 2,1]
6.1 Write down the fastest average serve speed (in km/h) achieved in this tournament. (1)
6.2 Consider the following correlation coefficients:
A. r = 0,93 B. r = –0,42 C. r = 0,52
6.2.1 Which ONE of the given correlation coefficients best fits the plotted data? (1)
6.2.2 Use the scatter plot and least squares regression line to motivate your
answer to QUESTION 6.2.1. (1)
6.3 What does the data suggest about the speed of a tennis serve (in km/h) and the height
of a player (in metres)? (1)
6.4 The equation of the regression line is given as $\hat{y} = 27,07 + bx$. Explain why, in this
context, the least squares regression line CANNOT intersect the y-axis at (0 ; 27,07). (1)
8.1 Calculate:
8.1.1 The mean of the data (2)
8.1.2 The interquartile range of the data (3)
8.2 The standard deviation of the times taken by the girls is 5,94. How many girls took
longer than ONE standard deviation from the mean to name the colours? (2)
8.3 Draw a box and whisker diagram to represent the data on the number line provided in
the ANSWER BOOK. (3)
8.4 The five-number summary of the times taken by a group of 23 boys in naming the
colours of the rectangles correctly is (15 ; 21 ; 23,5 ; 26 ; 38).
8.4.1 Which of the two groups, girls or boys, had the lower median time to correctly
name the colours of the rectangles? (1)
8.4.2 The first three learners who named the colours of all 30 rectangles correctly in
the shortest time will receive a prize. How many boys will be among these
three prize winners? Motivate your answer. (2)
[13]