Data for SPSS BCA
For each of the following problems, use SPSS to perform the following tasks:
a. Define the variable
b. State its type
c. Give a variable label or a value label where appropriate.
d. Enter the data
e. Save the file using the file name given in parentheses.
1. Heights, to the nearest centimeter, of 100 men are given below. (File name: heights.sav)
148 158 158 157 146 170 180 177 164 175
171 156 158 158 150 171 172 174 162 156
157 159 155 165 155 171 172 173 160 170
161 161 164 158 165 165 165 164 162 163
162 165 171 168 168 168 167 169 167 165
155 161 161 151 165 168 169 161 170 156
171 162 158 168 170 167 164 164 156 156
166 168 166 166 154 164 170 169 161 157
167 166 153 164 157 158 160 161 161 169
164 152 152 156 170 153 154 167 168 179
Case Processing Summary
                         Cases
                         Valid           Missing         Total
                         N    Percent    N    Percent    N    Percent
height of 100 persons    100  100.0%     0    0.0%       100  100.0%
Tests of Normality
                         Kolmogorov-Smirnov (a)      Shapiro-Wilk
                         Statistic  df   Sig.        Statistic  df   Sig.
height of 100 persons    .086       100  .066        .990       100  .624
a. Lilliefors Significance Correction
height of 100 persons Stem-and-Leaf Plot
Frequency Stem & Leaf
2.00 14 . 68
8.00 15 . 01223344
21.00 15 . 555666666777788888889
23.00 16 . 00111111112222344444444
27.00 16 . 555555566667777788888889999
15.00 17 . 000000111112234
3.00 17 . 579
1.00 18 . 0
Stem width: 10
Each leaf: 1 case(s)
2. The following data gives the religion of 25 students in a class. (File name: religion.sav)
Codes: H = Hindu, B = Buddhist, C = Christian, M = Muslim
H B C B H
M H H B B
H M B H H
M B H H C
C B H B H
religion
Frequency Percent Valid Percent Cumulative Percent
Valid Buddhist 8 32.0 32.0 32.0
christia 3 12.0 12.0 44.0
hindu 11 44.0 44.0 88.0
muslim 3 12.0 12.0 100.0
Total 25 100.0 100.0
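As a cross-check outside SPSS, the frequency table for problem 2 can be reproduced with a short Python sketch. It assumes the 25 codes are read row by row from the problem statement (H = Hindu, B = Buddhist, C = Christian, M = Muslim); the variable names are illustrative only.

```python
from collections import Counter

# Assumption: the 25 religion codes of problem 2, read row by row.
religion = (
    "H B C B H "
    "M H H B B "
    "H M B H H "
    "M B H H C "
    "C B H B H"
).split()

counts = Counter(religion)
n = len(religion)

# Print frequency, percent and cumulative percent, ordered by code.
cumulative = 0.0
for code in sorted(counts):
    percent = 100 * counts[code] / n
    cumulative += percent
    print(f"{code}  {counts[code]:2d}  {percent:5.1f}  {cumulative:5.1f}")
```

The rows come out in alphabetical order of the codes, which is the same ordering SPSS uses for string values in the table above.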
3. The following are the monthly incomes (in Rs.000s) of a group of persons. (Filename: income.sav)
9.2 10.5 11.2 9.2 17.2 10.7 12.5 9.8 10.7 12.2
12.2 9.5 9.0 9.4 14.1 11.7 13.3 10.3 9.6 9.0
12.8 9.8 13.0 10.0 14.9 13.0 9.8 12.1 12.7 13.3
11.7 12.2 11.6 11.5 11.2 9.2 10.4 11.9 11.0 11.4
12.1 8.1 10.7 11.9 9.6 11.0 11.5 12.4 9.9 11.7
Descriptive Statistics
                     N          Mean       Std. Deviation  Skewness               Kurtosis
                     Statistic  Statistic  Statistic       Statistic  Std. Error  Statistic  Std. Error
income               50         11.2740    1.70304         .855       .337        1.769      .662
Valid N (listwise)   50
income
Frequency Percent Valid Percent Cumulative Percent
Valid 8.10 1 2.0 2.0 2.0
9.00 2 4.0 4.0 6.0
9.20 3 6.0 6.0 12.0
9.40 1 2.0 2.0 14.0
9.50 1 2.0 2.0 16.0
9.60 2 4.0 4.0 20.0
9.80 3 6.0 6.0 26.0
9.90 1 2.0 2.0 28.0
10.00 1 2.0 2.0 30.0
10.30 1 2.0 2.0 32.0
10.40 1 2.0 2.0 34.0
10.50 1 2.0 2.0 36.0
10.70 3 6.0 6.0 42.0
11.00 2 4.0 4.0 46.0
11.20 2 4.0 4.0 50.0
11.40 1 2.0 2.0 52.0
11.50 2 4.0 4.0 56.0
11.60 1 2.0 2.0 58.0
11.70 3 6.0 6.0 64.0
11.90 2 4.0 4.0 68.0
12.10 2 4.0 4.0 72.0
12.20 3 6.0 6.0 78.0
12.40 1 2.0 2.0 80.0
12.50 1 2.0 2.0 82.0
12.70 1 2.0 2.0 84.0
12.80 1 2.0 2.0 86.0
13.00 2 4.0 4.0 90.0
13.30 2 4.0 4.0 94.0
14.10 1 2.0 2.0 96.0
14.90 1 2.0 2.0 98.0
17.20 1 2.0 2.0 100.0
Total 50 100.0 100.0
4. The data given in the following table indicate the time intervals in days on 21 different occasions when an
order was placed. (Filename: order.sav)
Occasion Interval in days Occasion Interval in days
1 55 12 76
2 52 13 52
3 31 14 37
4 53 15 13
5 79 16 24
6 24 17 20
7 29 18 20
8 293 19 30
9 101 20 28
10 22 21 39
11 287
Statistics
order
N Valid 31
  Missing 0
Mean 49.35
Median 24.00
Mode 20
Std. Deviation 67.811
Variance 4598.370
Skewness 3.145
Std. Error of Skewness .421
Kurtosis 9.654
Std. Error of Kurtosis .821
order
Frequency Percent Valid Percent Cumulative Percent
Valid 12 1 3.2 3.2 3.2
13 2 6.5 6.5 9.7
14 1 3.2 3.2 12.9
15 1 3.2 3.2 16.1
16 1 3.2 3.2 19.4
17 1 3.2 3.2 22.6
18 1 3.2 3.2 25.8
19 1 3.2 3.2 29.0
20 3 9.7 9.7 38.7
21 1 3.2 3.2 41.9
22 1 3.2 3.2 45.2
24 2 6.5 6.5 51.6
28 1 3.2 3.2 54.8
29 1 3.2 3.2 58.1
30 1 3.2 3.2 61.3
31 1 3.2 3.2 64.5
37 1 3.2 3.2 67.7
39 1 3.2 3.2 71.0
52 2 6.5 6.5 77.4
53 1 3.2 3.2 80.6
55 1 3.2 3.2 83.9
76 1 3.2 3.2 87.1
79 1 3.2 3.2 90.3
101 1 3.2 3.2 93.5
287 1 3.2 3.2 96.8
293 1 3.2 3.2 100.0
Total 31 100.0 100.0
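The Statistics table above reports 31 valid cases although problem 4 lists 21 occasions. Its mean of 49.35 equals (1365 + 165)/31, which is what results when the occasion numbers 12 to 21 are entered as intervals alongside the 21 real values. A minimal Python sketch, using only the 21 intervals from the problem table, gives the summary measures for the intended data set (variable names are illustrative):

```python
import statistics

# Intervals in days for the 21 occasions listed in problem 4.
intervals = [55, 52, 31, 53, 79, 24, 29, 293, 101, 22, 287,
             76, 52, 37, 13, 24, 20, 20, 30, 28, 39]

n = len(intervals)
mean = statistics.mean(intervals)
median = statistics.median(intervals)
modes = statistics.multimode(intervals)      # all values tied for most frequent
variance = statistics.variance(intervals)    # sample variance (n - 1 divisor)
std_dev = statistics.stdev(intervals)

print(f"N = {n}, mean = {mean:.2f}, median = {median}")
print(f"modes = {sorted(modes)}, variance = {variance:.3f}, sd = {std_dev:.3f}")
```

The two large intervals (287 and 293 days) pull the mean well above the median, so the median is the more representative measure for these data.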
5. The total quantity of aluminum sheets packed during 51 months is given in the following table. (Filename: aluminum.sav)
Total quantity packed (to the nearest ton)
Month      2010  2011  2012  2013  2014
January    669   703   1003  801   749
February   620   670   828   687   707
March      718   717   622   629   657
April      772   760   836   722
May        859   715   928   859
June       787   930   818   806
July       721   898   697   895
August     842   877   639   954
September  746   814   605   711
October    696   395   511   564
November   733   474   810   516
December   771   969   994   735
Statistics
aluminum
N Valid 51
  Missing 0
Mean 747.82
Median 735.00
Std. Deviation 132.250
Variance 17490.028
Skewness -.280
Std. Error of Skewness .333
Kurtosis .191
Std. Error of Kurtosis .656
aluminum
Frequency Percent Valid Percent Cumulative Percent
Valid 395 1 2.0 2.0 2.0
474 1 2.0 2.0 3.9
511 1 2.0 2.0 5.9
516 1 2.0 2.0 7.8
564 1 2.0 2.0 9.8
605 1 2.0 2.0 11.8
620 1 2.0 2.0 13.7
622 1 2.0 2.0 15.7
629 1 2.0 2.0 17.6
639 1 2.0 2.0 19.6
657 1 2.0 2.0 21.6
669 1 2.0 2.0 23.5
670 1 2.0 2.0 25.5
687 1 2.0 2.0 27.5
696 1 2.0 2.0 29.4
697 1 2.0 2.0 31.4
703 1 2.0 2.0 33.3
707 1 2.0 2.0 35.3
711 1 2.0 2.0 37.3
715 1 2.0 2.0 39.2
717 1 2.0 2.0 41.2
718 1 2.0 2.0 43.1
721 1 2.0 2.0 45.1
722 1 2.0 2.0 47.1
733 1 2.0 2.0 49.0
735 1 2.0 2.0 51.0
746 1 2.0 2.0 52.9
749 1 2.0 2.0 54.9
760 1 2.0 2.0 56.9
771 1 2.0 2.0 58.8
772 1 2.0 2.0 60.8
787 1 2.0 2.0 62.7
801 1 2.0 2.0 64.7
806 1 2.0 2.0 66.7
810 1 2.0 2.0 68.6
814 1 2.0 2.0 70.6
818 1 2.0 2.0 72.5
828 1 2.0 2.0 74.5
836 1 2.0 2.0 76.5
842 1 2.0 2.0 78.4
859 2 3.9 3.9 82.4
877 1 2.0 2.0 84.3
895 1 2.0 2.0 86.3
898 1 2.0 2.0 88.2
928 1 2.0 2.0 90.2
930 1 2.0 2.0 92.2
954 1 2.0 2.0 94.1
969 1 2.0 2.0 96.1
994 1 2.0 2.0 98.0
1003 1 2.0 2.0 100.0
Total 51 100.0 100.0
Line Graph
6. Discrete data
Number of children Frequency
0 7
1 19
2 38
3 52
4 40
5 28
6 11
7 5
Stem and leaf plot
For each of the following problems:
a. Draw a stem and leaf diagram.
b. Form a suitable frequency distribution.
c. Draw a histogram of the distribution.
8. The grade point averages of students in a statistics course are as follows:
3.4 2.1 2.4 2.8 3.2 3.7 2.5 2.2 2.8 2.6
3.2 3.3 3.3 2.9 2.4 2.5 2.4 3.2 2.3 2.8
2.8 3.1 3.1 3.0 2.7 3.2 3.5 2.6 2.5 2.9
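A stem-and-leaf display for these grade point averages can be sketched in Python as a check on the SPSS plot. This is an illustration only: it assumes a stem width of 1 (the units digit) with the first decimal as the leaf, and unlike SPSS it does not split stems into two lines.

```python
from collections import defaultdict

# Grade point averages from problem 8.
gpa = [3.4, 2.1, 2.4, 2.8, 3.2, 3.7, 2.5, 2.2, 2.8, 2.6,
       3.2, 3.3, 3.3, 2.9, 2.4, 2.5, 2.4, 3.2, 2.3, 2.8,
       2.8, 3.1, 3.1, 3.0, 2.7, 3.2, 3.5, 2.6, 2.5, 2.9]

leaves = defaultdict(list)
for value in sorted(gpa):
    # Work in tenths to avoid floating-point surprises: 2.8 -> 28 -> stem 2, leaf 8.
    stem, leaf = divmod(round(value * 10), 10)
    leaves[stem].append(leaf)

# One line per stem: frequency, stem, then the sorted leaves.
for stem in sorted(leaves):
    print(f"{len(leaves[stem]):2d}  {stem} . {''.join(map(str, leaves[stem]))}")
```

The frequencies in the first column also give a two-class frequency distribution; narrower classes (for example, width 0.3) would be more informative for the histogram in part (c).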
valueofexport
Frequency Percent Valid Percent Cumulative Percent
Valid 3 1 7.1 7.1 7.1
4 1 7.1 7.1 14.3
5 1 7.1 7.1 21.4
6 1 7.1 7.1 28.6
6 1 7.1 7.1 35.7
7 1 7.1 7.1 42.9
8 1 7.1 7.1 50.0
11 1 7.1 7.1 57.1
14 1 7.1 7.1 64.3
15 1 7.1 7.1 71.4
31 1 7.1 7.1 78.6
51 1 7.1 7.1 85.7
60 1 7.1 7.1 92.9
221 1 7.1 7.1 100.0
Total 14 100.0 100.0
9. The number of accidents occurring during the last year in 50 factories of a State is given below:
12 14 46 17 7 7 19 6 27 4
14 5 11 7 33 2 9 5 4 8
4 10 8 9 2 37 10 8 22 9
13 10 9 11 12 14 12 15 18 11
6 13 20 12 24 11 31 6 12 3
Statistics
accident
N Valid 50
  Missing 0
Mean 12.78
Median 11.00
Std. Deviation 9.117
Variance 83.114
For each of the following data, use SPSS to perform the following tasks:
a. Draw a bar diagram
b. Draw a pie diagram
c. Draw a line diagram
10. The number of times each of the five categories occurs is given below:
Category A B C D E
Frequency 14 43 27 16 33
Statistics
frequency
N Valid 5
  Missing 0
Mean 26.60
Mode 14a
Std. Deviation 12.054
Variance 145.300
11. The percentage of types of vehicles on the roads of a city was recorded as
follows in a year.
Vehicle type Percentage of total
Private cars 24.5
Trucks 35.7
Buses 16.8
Motorcycles 12.9
Other 10.1
Statistics
vehicle
N Valid 5
  Missing 0
Mean 20.0000
Median 16.8000
Std. Deviation 10.31261
vehicle
Frequency Percent Valid Percent Cumulative Percent
Valid 10.10 1 20.0 20.0 20.0
12.90 1 20.0 20.0 40.0
16.80 1 20.0 20.0 60.0
24.50 1 20.0 20.0 80.0
35.70 1 20.0 20.0 100.0
Total 5 100.0 100.0
Histogram:
Pie chart:
Line graph:
Descriptive statistics:
12. The raw data displayed below are the electricity charges (in rupees) paid for the month of January 2020,
obtained from a random sample of 50 users.
97 171 202 178 147 102 153 197 127 82
157 185 90 116 172 111 148 213 130 165
141 149 206 175 123 128 148 168 109 167
95 163 150 154 130 143 187 166 139 149
108 119 183 151 114 135 191 137 129 158
Calculate the mean, median, mode, standard deviation, variance, skewness and kurtosis.
Frequency:
Statistics
electricity_charge
N Valid 50
  Missing 0
Mean 147.16
Median 148.50
Mode 130a
Std. Deviation 31.656
Variance 1002.096
Skewness .011
Std. Error of Skewness .337
Kurtosis -.544
Std. Error of Kurtosis .662
a. Multiple modes exist. The smallest value is shown
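The measures requested in problem 12 can be cross-checked with Python's standard library. The sketch below is not SPSS syntax; the skewness and kurtosis lines use the bias-adjusted sample formulas, which are assumed here to be the ones behind the SPSS figures.

```python
import statistics

# Electricity charges (in rupees) for the 50 users in problem 12.
charges = [97, 171, 202, 178, 147, 102, 153, 197, 127, 82,
           157, 185, 90, 116, 172, 111, 148, 213, 130, 165,
           141, 149, 206, 175, 123, 128, 148, 168, 109, 167,
           95, 163, 150, 154, 130, 143, 187, 166, 139, 149,
           108, 119, 183, 151, 114, 135, 191, 137, 129, 158]

n = len(charges)
mean = statistics.mean(charges)
median = statistics.median(charges)
mode = min(statistics.multimode(charges))   # smallest of several modes, as SPSS reports
variance = statistics.variance(charges)     # sample variance (n - 1 divisor)
std_dev = statistics.stdev(charges)

# Assumed formulas: bias-adjusted sample skewness and excess kurtosis.
z3 = sum(((x - mean) / std_dev) ** 3 for x in charges)
z4 = sum(((x - mean) / std_dev) ** 4 for x in charges)
skewness = n / ((n - 1) * (n - 2)) * z3
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.3f} sd={std_dev:.3f}")
print(f"skewness={skewness:.3f} kurtosis={kurtosis:.3f}")
```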
13. Temperatures (in Celsius) of Kathmandu city for 50 days are:
20.8 22.8 21.9 22.0 20.7 20.9 25.0 22.2 22.8 20.1
25.3 20.7 22.5 19.7 20.3 22.1 25.2 21.9 28.2 28.0
28.5 22.1 23.4 24.3 25.0 26.3 22.2 20.6 19.3 18.4
20.9 21.6 28.9 25.9 23.7 21.3 19.5 19.0 19.7 20.5
26.0 29.9 28.6 20.6 27.8 27.0 23.8 24.0 25.1 20.8
Calculate all quartiles, the 6th and 9th deciles, the median, and the 25th and 80th percentiles.
Quartile:
Statistics
temp
N Valid 50
  Missing 0
Median 22.2000
Percentiles 25 20.7000
            50 22.2000
            75 25.2250
Decile:
6th:
Statistics
temp
N Valid 50
  Missing 0
Median 22.2000
Percentiles 10 19.7000
            20 20.6000
            30 20.8300
            40 21.9000
            50 22.2000
            60 23.5800
            70 25.0000
            80 25.9800
            90 28.1800
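9th:
The 9th decile is the 90th percentile, 28.1800. The output below instead gives the cut points for 9 equal groups.
Statistics
temp
N Valid 50
  Missing 0
Median 22.2000
Percentiles 11.11111111 19.7000
            22.22222222 20.6333
            33.33333333 20.9000
            44.44444444 22.0667
            55.55555556 22.8000
            66.66666667 24.3000
            77.77777778 25.7000
            88.88888889 28.0667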
Percentile:
Statistics
temp
N Valid 50
  Missing 0
Median 22.2000
Percentiles 25 20.7000
            80 25.9800
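The quartiles, deciles and percentiles for problem 13 can be checked with `statistics.quantiles`. Its default `"exclusive"` method interpolates at position p(n + 1), which is assumed here to correspond to the weighted-average definition used in the SPSS output. A minimal sketch:

```python
import statistics

# Temperatures (in Celsius) for the 50 days in problem 13.
temps = [20.8, 22.8, 21.9, 22.0, 20.7, 20.9, 25.0, 22.2, 22.8, 20.1,
         25.3, 20.7, 22.5, 19.7, 20.3, 22.1, 25.2, 21.9, 28.2, 28.0,
         28.5, 22.1, 23.4, 24.3, 25.0, 26.3, 22.2, 20.6, 19.3, 18.4,
         20.9, 21.6, 28.9, 25.9, 23.7, 21.3, 19.5, 19.0, 19.7, 20.5,
         26.0, 29.9, 28.6, 20.6, 27.8, 27.0, 23.8, 24.0, 25.1, 20.8]

# quantiles(..., n=k) returns the k - 1 cut points dividing the data into k groups.
q1, q2, q3 = statistics.quantiles(temps, n=4)
deciles = statistics.quantiles(temps, n=10)
percentiles = statistics.quantiles(temps, n=100)

d6, d9 = deciles[5], deciles[8]                # 6th and 9th deciles
p25, p80 = percentiles[24], percentiles[79]    # 25th and 80th percentiles

print(f"Q1={q1:.4f} median={q2:.4f} Q3={q3:.4f}")
print(f"D6={d6:.4f} D9={d9:.4f} P25={p25:.4f} P80={p80:.4f}")
```

The 6th decile is the same cut point as the 60th percentile, and the 9th decile the same as the 90th percentile, so a single run with `n=10` answers both decile questions.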
Probability
14. The following table gives the survey data of 30 people interviewed at a shopping mall in Kathmandu. It gives
their smoking habit (smoker (S)/non-smoker (N)) and their health condition (cancer (C)/heart
problem (H)/good health (G)), together with their identification number.
Id Habit Health condition Id Habit Health condition
1 S C 16 N G
2 N H 17 S G
3 N G 18 S C
4 N G 19 S C
5 S G 20 N G
6 N C 21 N G
7 N G 22 N C
8 S C 23 S C
9 S C 24 N G
10 S H 25 N G
11 N C 26 S G
12 N G 27 S C
13 N G 28 N G
14 S C 29 N G
15 N H 30 N G
Use SPSS to form a 2x3 cross-classification table of habit by health condition. If a person is selected
randomly from this sample of 30 people, evaluate the probabilities of the following events.
(a) The person is a smoker.
(b) The person is a non-smoker.
(c) The person has cancer.
(d) The person has heart problem.
(e) The person has good health.
(f) The person is a smoker and has cancer.
(g) The person is a smoker and has heart problem.
(h) The person is a non-smoker and has cancer.
(i) The person is a smoker given that the person has cancer.
(j) The person is in good health given that he is a smoker.
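The cross-classification and the probabilities in (a) to (j) can be sketched in Python as a check on the SPSS crosstab. Each record below is the habit code followed by the health code, in Id order 1 to 30; the helper name `prob` is illustrative.

```python
from collections import Counter
from fractions import Fraction

# Habit + health condition for Id 1..30 (e.g. "SC" = smoker with cancer).
records = ("SC NH NG NG SG NC NG SC SC SH NC NG NG SC NH "
           "NG SG SC SC NG NG NC SC NG NG SG SC NG NG NG").split()

table = Counter(records)   # cell counts of the 2x3 table
n = len(records)

def prob(predicate):
    """Probability that a randomly selected person satisfies the predicate."""
    return Fraction(sum(c for cell, c in table.items() if predicate(cell)), n)

p_smoker = prob(lambda r: r[0] == "S")                 # (a)
p_non_smoker = prob(lambda r: r[0] == "N")             # (b)
p_cancer = prob(lambda r: r[1] == "C")                 # (c)
p_heart = prob(lambda r: r[1] == "H")                  # (d)
p_good = prob(lambda r: r[1] == "G")                   # (e)
p_smoker_and_cancer = prob(lambda r: r == "SC")        # (f)
p_smoker_and_heart = prob(lambda r: r == "SH")         # (g)
p_non_smoker_and_cancer = prob(lambda r: r == "NC")    # (h)
p_smoker_given_cancer = p_smoker_and_cancer / p_cancer     # (i)
p_good_given_smoker = prob(lambda r: r == "SG") / p_smoker  # (j)

for habit in "SN":
    print(habit, [table[habit + health] for health in "CHG"])
print(p_smoker, p_cancer, p_smoker_given_cancer, p_good_given_smoker)
```

Using `Fraction` keeps the answers exact, which makes the conditional probabilities in (i) and (j) easy to compare with a hand calculation from the table.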
Probability distribution:
15. A showroom record of the sales of a car brand shows that 40 percent of customers use cheques as a mode of
payment. It is expected that following one of their clearance sales advertisement on the TV, 5 customers will
purchase cars on the following day. Use SPSS
To construct the probability distribution of the number of customers who make their payment by
cheque.
To plot these probabilities in a histogram.
To find the following probabilities:
(a) P(at most 3 customers make their payment by cheque)
(b) P(exactly 3 customers make their payment by cheque)
(c) P(at least 3 customers make their payment by cheque)
(d) P(more than 4 customers make their payment by cheque)
(e) P(less than 3 customers make their payment by cheque)
(f) P(more than 2 and less than or equal to 5 customers make their payment by cheque)
Also, calculate mean, variance and the standard deviation of the number of customers making their payment
by cheque.
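Problem 15 can be modelled as a binomial distribution with n = 5 and p = 0.4, assuming the five customers choose their mode of payment independently. A minimal Python sketch of the distribution and the requested probabilities (variable names are illustrative):

```python
from math import comb, sqrt

n, p = 5, 0.4   # 5 customers, each pays by cheque with probability 0.4

# Binomial probabilities P(X = k) for k = 0..5.
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

at_most_3 = sum(pmf[:4])        # (a) P(X <= 3)
exactly_3 = pmf[3]              # (b) P(X = 3)
at_least_3 = sum(pmf[3:])       # (c) P(X >= 3)
more_than_4 = sum(pmf[5:])      # (d) P(X > 4)
less_than_3 = sum(pmf[:3])      # (e) P(X < 3)
between_2_and_5 = sum(pmf[3:])  # (f) P(2 < X <= 5)

mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

for k, prob_k in enumerate(pmf):
    print(f"P(X = {k}) = {prob_k:.5f}")
print(f"mean={mean:.1f} variance={variance:.1f} sd={std_dev:.4f}")
```

Because X cannot exceed 5, events (c) and (f) cover the same outcomes (3, 4 or 5 customers) and so have the same probability.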
16. The marketing manager of a mail order company has noted that, on average, she receives 10 complaint calls
from customers during a 5-day working week. She has also noted that the calls occur at random. Use SPSS
To construct the probability distribution of the number of complaint calls in a single day for up to 15
complaints.
To plot these probabilities in a histogram.
To find the following probabilities
(a) P(at most 12 complaint calls in a week)
(b) P(exactly 9 complaint calls in a week)
(c) P(at least 8 complaint calls in a week)
(d) P(more than 2 complaint calls in a week)
(e) P(less than complaint calls in a week)
(f) P(more than 1 and less than or equal to 5 complaint calls in a week)
Also, calculate the mean, variance and the standard deviation of the number of complaint calls in a single
day.
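Problem 16 fits a Poisson model because the calls occur at random at a known average rate. The sketch below assumes a rate of 10/5 = 2 calls per day for the single-day distribution and 10 calls per week for probabilities (a) to (f). Part (e) is left out because its threshold is missing from the problem statement. Function names are illustrative.

```python
from math import exp, factorial, sqrt

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variable with mean lam."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

daily_rate = 10 / 5    # assumed: 10 calls spread evenly over 5 working days
weekly_rate = 10

# Distribution of the number of calls in a single day, for 0..15 calls.
daily_pmf = [poisson_pmf(k, daily_rate) for k in range(16)]

at_most_12 = poisson_cdf(12, weekly_rate)                          # (a)
exactly_9 = poisson_pmf(9, weekly_rate)                            # (b)
at_least_8 = 1 - poisson_cdf(7, weekly_rate)                       # (c)
more_than_2 = 1 - poisson_cdf(2, weekly_rate)                      # (d)
between_1_and_5 = poisson_cdf(5, weekly_rate) - poisson_cdf(1, weekly_rate)  # (f)

# For a Poisson variable the mean and the variance both equal the rate.
daily_mean = daily_variance = daily_rate
daily_sd = sqrt(daily_variance)

print([round(q, 4) for q in daily_pmf])
print(at_most_12, exactly_9, at_least_8, more_than_2, between_1_and_5)
```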
Continuous probability distribution: Normal distribution
16. Ten thousand candidates appeared in a certain examination carrying a maximum of 100 marks. It was found
that the marks were normally distributed with a mean 39.5 and standard deviation 12.5. Determine
approximately the number of candidates who secured a first class, for which the minimum is 60 marks.
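The expected number of first-class candidates is 10,000 times P(X >= 60) for X normal with mean 39.5 and standard deviation 12.5. A minimal Python sketch using the standard library's `NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

marks = NormalDist(mu=39.5, sigma=12.5)
candidates = 10_000

# P(X >= 60): the upper-tail area beyond the first-class cut-off.
p_first_class = 1 - marks.cdf(60)
expected_first_class = candidates * p_first_class

print(f"z = {(60 - 39.5) / 12.5:.2f}")
print(f"P(X >= 60) = {p_first_class:.4f}")
print(f"expected candidates = {expected_first_class:.0f}")
```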
17. The heights of two-year-olds are normally distributed with a mean height of 81 cm and a standard deviation
of 3.4 cm. Paediatricians regularly measure the heights of toddlers to determine whether there is a
problem. There may be a problem when the child is in the top or bottom 5% of heights.
(a) What is the probability that a two-year-old child will be taller than 90 cm?
(b) What is the probability that a two-year-old child will be shorter than 85 cm?
(c) What is the probability that a two-year-old child is between 75 and 85 cm tall?
(d) Determine the heights of two-year-old children that could be a problem.
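Parts (a) to (c) are areas under the normal curve, and part (d) asks for the 5th and 95th percentiles. A minimal Python sketch, assuming heights follow the stated normal distribution exactly (variable names are illustrative):

```python
from statistics import NormalDist

heights = NormalDist(mu=81, sigma=3.4)   # heights of two-year-olds in cm

p_taller_90 = 1 - heights.cdf(90)                  # (a) P(X > 90)
p_shorter_85 = heights.cdf(85)                     # (b) P(X < 85)
p_between = heights.cdf(85) - heights.cdf(75)      # (c) P(75 < X < 85)

# (d) Heights in the bottom or top 5% may indicate a problem.
low_cutoff = heights.inv_cdf(0.05)
high_cutoff = heights.inv_cdf(0.95)

print(f"(a) {p_taller_90:.4f}  (b) {p_shorter_85:.4f}  (c) {p_between:.4f}")
print(f"(d) below {low_cutoff:.2f} cm or above {high_cutoff:.2f} cm")
```

The two cut-offs in (d) are symmetric about the mean of 81 cm because the normal distribution is symmetric.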
Correlation and Regression:
18. Find the coefficient of correlation from the following data:
Wholesale index numbers for rice and wheat:
Month Rice Wheat
Jan 410 400
Feb 405 350
March 410 365
April 455 415
May 490 420
June 510 420
July 490 430
Aug 475 470
Sept 465 505
Oct 450 530
Nov 470 525
Dec 505 545
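The coefficient of correlation asked for in problem 18 is Pearson's r, the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. A minimal Python sketch written out by hand (the function name `pearson_r` is illustrative):

```python
from math import sqrt

# Wholesale index numbers, January to December.
rice = [410, 405, 410, 455, 490, 510, 490, 475, 465, 450, 470, 505]
wheat = [400, 350, 365, 415, 420, 420, 430, 470, 505, 530, 525, 545]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(rice, wheat)
print(f"r = {r:.4f}")
```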
19. Twelve secretaries at a university were asked to take a special three-day intensive course to improve their
keyboarding skills. At the beginning and again at the end of the course, they were given a particular two-
page letter and asked to type it flawlessly. The data shown in the following table were recorded.
Typist Years of experience (X) Improvement in words per minute (Y)
A 2 9
B 6 11
C 3 8
D 8 12
E 10 14
F 5 9
G 10 14
H 11 13
I 12 14
J 9 10
K 8 9
L 10 10
(a) Find the equation of the regression line.
(b) As a check of your calculations in part (a), plot the 12 points and graph the line.
(c) Does it appear that the secretaries' experience is linearly related to their improvement?
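The regression line in part (a) is the least-squares line Y = a + bX, with b = Sxy/Sxx and a = mean(Y) - b * mean(X). A minimal Python sketch that can be compared against the SPSS coefficients (the function name `fit_line` is illustrative, and the plot for part (b) is not included):

```python
# Years of experience (X) and improvement in words per minute (Y), typists A to L.
experience = [2, 6, 3, 8, 10, 5, 10, 11, 12, 9, 8, 10]
improvement = [9, 11, 8, 12, 14, 9, 14, 13, 14, 10, 9, 10]

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(experience, improvement)
print(f"Y = {intercept:.4f} + {slope:.4f} X")

# Fitted values at the smallest and largest X, useful for drawing the line in (b).
for x in (min(experience), max(experience)):
    print(x, round(intercept + slope * x, 2))
```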
ANOVA
20. Samples of seven individuals were selected randomly from each of three communities. The ages of the persons were
as tabulated:
Community A Community B Community C
16 65 45
15 43 30
25 77 22
30 90 66
39 82 47
20 69 33
16 73 50
Use SPSS to carry out the ANOVA. Is there a significant difference in the ages?
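The one-way ANOVA for problem 20 splits the total variation in age into a between-communities part and a within-communities part, and compares their mean squares with an F ratio. A minimal Python sketch for checking the SPSS ANOVA table; the 5% critical value 3.55 for (2, 18) degrees of freedom is taken from a standard F table and is an assumption of this sketch, not a computed quantity.

```python
# Ages of the seven individuals sampled from each community.
community_a = [16, 15, 25, 30, 39, 20, 16]
community_b = [65, 43, 77, 90, 82, 69, 73]
community_c = [45, 30, 22, 66, 47, 33, 50]
groups = [community_a, community_b, community_c]

all_ages = [age for group in groups for age in group]
grand_mean = sum(all_ages) / len(all_ages)

# Between-groups and within-groups sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((age - sum(g) / len(g)) ** 2 for g in groups for age in g)

df_between = len(groups) - 1
df_within = len(all_ages) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

F_CRITICAL_05 = 3.55   # assumed table value for F(0.05; 2, 18)
print(f"SSB={ss_between:.2f} SSW={ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
print("significant at 5%" if f_stat > F_CRITICAL_05 else "not significant at 5%")
```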
21. You are given the following data:
(a) Test to determine whether the treatment means differ. (Use a 0.05 level of significance.)
(b) Test to determine whether the block means differ. (Use a 0.05 level of significance.)
(c) Perform a Tukey test, if you found a significant F.
Block Treatment 1 Treatment 2 Treatment 3
A 7 12 8
B 10 8 9
C 12 16 13
D 9 13 6
E 12 10 11
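Problem 21 is a randomized block design: the total sum of squares is split into treatment, block and error parts, and each of the first two is tested against the error mean square. A minimal Python sketch for parts (a) and (b). The 5% critical values 4.46 for (2, 8) and 3.84 for (4, 8) degrees of freedom are taken from a standard F table and are assumptions of this sketch; the Tukey test in part (c) is not implemented.

```python
# Rows are blocks A to E; columns are treatments 1 to 3.
data = {
    "A": [7, 12, 8],
    "B": [10, 8, 9],
    "C": [12, 16, 13],
    "D": [9, 13, 6],
    "E": [12, 10, 11],
}
rows = list(data.values())
b, t = len(rows), len(rows[0])     # number of blocks, number of treatments

grand_mean = sum(map(sum, rows)) / (b * t)
treatment_means = [sum(row[j] for row in rows) / b for j in range(t)]
block_means = [sum(row) / t for row in rows]

ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
ss_treat = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ss_block = t * sum((m - grand_mean) ** 2 for m in block_means)
ss_error = ss_total - ss_treat - ss_block

df_treat, df_block = t - 1, b - 1
df_error = df_treat * df_block
ms_error = ss_error / df_error

f_treat = (ss_treat / df_treat) / ms_error
f_block = (ss_block / df_block) / ms_error

F_CRIT_TREAT = 4.46   # assumed table value for F(0.05; 2, 8)
F_CRIT_BLOCK = 3.84   # assumed table value for F(0.05; 4, 8)
print(f"treatments: F({df_treat}, {df_error}) = {f_treat:.3f}, reject H0: {f_treat > F_CRIT_TREAT}")
print(f"blocks:     F({df_block}, {df_error}) = {f_block:.3f}, reject H0: {f_block > F_CRIT_BLOCK}")
```

A Tukey comparison is only called for when the corresponding F test rejects the null hypothesis, so the printed decisions indicate whether part (c) applies.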
***