Sta 250 2022 Session 2

Final examination paper for the course Fundamentals of Regression Analysis (STA250), Universiti Teknologi MARA, August 2022. Question 1 asks candidates to plot a scatter diagram of life insurance against income and to compute and interpret the Pearson correlation coefficient. Question 2 covers simple linear regression: fitting an equation to data on height and long jump distance, and interpreting SPSS output from a regression of final examination marks on diagnostic test scores.

CONFIDENTIAL CS/AUG 2022/STA250

UNIVERSITI TEKNOLOGI MARA


FINAL EXAMINATION

COURSE : FUNDAMENTALS OF REGRESSION ANALYSIS


COURSE CODE : STA250
EXAMINATION : AUGUST 2022
TIME : 3 HOURS

INSTRUCTIONS TO CANDIDATES

1. This question paper consists of seven (7) questions.

2. Answer ALL questions in the Answer Booklet. Start each answer on a new page.

3. Do not bring any material into the examination room unless permission is given by the invigilator.

4. Please check to make sure that this examination pack consists of:

i. the Question Paper
ii. an Answer Booklet – provided by the Faculty
iii. a Statistical Table – provided by the Faculty

5. Answer ALL questions in English.

DO NOT TURN THIS PAGE UNTIL YOU ARE TOLD TO DO SO


This examination paper consists of 9 printed pages

© Hak Cipta Universiti Teknologi MARA CONFIDENTIAL


CONFIDENTIAL 2 CS/AUG 2022/STA250/Set 1

Answer all questions

QUESTION 1
Anna and her team conducted a study to determine how the amount of life insurance depends on
a person's income. They collected information on twelve persons. The following table
lists the annual incomes and the amounts of the life insurance policies of these twelve
persons (both in RM ‘000).

Annual income (RM ‘000)     35   84   36   48   71   86   42   47   79   66   34   80
Life insurance (RM ‘000)    75  500  100  150  350  550  120  180  380  300   70  400

a) Plot a scatter diagram for the above data. What conclusion can you make from the plot?
(5 marks)

b) Compute the Pearson correlation coefficient and interpret the value obtained.
(5 marks)

c) Test the significance of the correlation coefficient at a 95% confidence level.
(5 marks)
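
Parts b) and c) can be checked numerically. The following is a minimal Python sketch, not a model answer: it computes the Pearson correlation coefficient from the table above and the usual t statistic with n − 2 degrees of freedom for H0: ρ = 0. The critical value 2.228 is an assumption taken from a standard t table (two-tailed, α = 0.05, 10 degrees of freedom), and all variable names are illustrative.

```python
import math

# Data from the Question 1 table (both in RM '000).
income = [35, 84, 36, 48, 71, 86, 42, 47, 79, 66, 34, 80]
insurance = [75, 500, 100, 150, 350, 550, 120, 180, 380, 300, 70, 400]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(insurance) / n

# Corrected sums of squares and cross-products.
s_xx = sum((x - mean_x) ** 2 for x in income)
s_yy = sum((y - mean_y) ** 2 for y in insurance)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, insurance))

# Pearson correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0 against H1: rho != 0, with n - 2 df.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Assumed table value: t(0.025, 10) = 2.228.
T_CRITICAL = 2.228
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"r = {r:.4f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

A value of r close to +1 indicates a strong positive linear relationship between income and the amount of life insurance.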

QUESTION 2
a) In the long jump, it is believed that the distance jumped depends on the height of the
player. To test this statement, a long jump coach selected a sample of ten players and
measured their heights. The information gathered is shown in the following table.

Player   Height of player (cm)   Distance jumped (cm)
1        180                     223
2        175                     221
3        163                     190
4        169                     208
5        178                     218
6        170                     211
7        164                     198
8        183                     225
9        176                     210
10       166                     205

Fit a simple linear regression equation for the above data using the least squares method.
(5 Marks)


b) A statistics diagnostic test is given to all new students taking a statistics course at a college.
A lecturer is interested in studying the relationship between the statistics diagnostic test score
and the mark the student scores in the statistics final examination. The data were
collected from thirty students and the following output from SPSS was obtained.

i) Write down the estimated regression equation of the model.
(2 Marks)

ii) Estimate the coefficient of determination and interpret the value obtained.
(3 Marks)

iii) Test the significance of the model. Use α = 0.05.
(4 Marks)

iv) Does the statistics diagnostic test score affect the final grades? Test the variable at
a 5% significance level.
(4 Marks)

v) Find the 95% confidence interval for the slope.
(2 Marks)
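
For part a) of Question 2, the least squares estimates can be verified with a short script. This is a sketch only, using the textbook formulas b1 = Sxy / Sxx and b0 = ȳ − b1·x̄ on the height and distance data from the table; the variable names are illustrative and not part of the paper.

```python
# Data from the Question 2 a) table (both in cm).
heights = [180, 175, 163, 169, 178, 170, 164, 183, 176, 166]
distances = [223, 221, 190, 208, 218, 211, 198, 225, 210, 205]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(distances) / n

# Corrected sum of squares of x and corrected sum of cross-products.
s_xx = sum((x - mean_x) ** 2 for x in heights)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, distances))

# Least squares estimates of the slope and the intercept.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"distance = {intercept:.3f} + {slope:.4f} * height")
```

The slope estimates the change in mean distance jumped (in cm) for each additional centimetre of height.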


QUESTION 3
A researcher conducted a study to determine whether the weight loss (Y, in kilograms) of a
particular compound depends on the amount of time (X, in hours) the compound has been
exposed to air. A simple linear regression model was fitted, and the output is given
below.

Observation   Weight Loss   Amount of Time   Residual   Predicted
1             2.5           2                 0.058     2.442
2             6.7           6                 0.573     6.127
3             3.9           5                -1.306     5.206
4             4.1           4                -0.185     4.285
5             5.3           5                 0.094     5.206
6             8.8           8                 0.830     7.970
7             6.2           6                 0.073     6.127
8             7.8           8                -0.170     7.970
9             4.3           5                -0.906     5.206
10            5.8           4                 1.515     4.285
11            6.5           7                -0.548     7.048
12            6.1           6                -0.027     6.127


a) Based on the graph provided, justify whether the independence and normality
assumptions of error terms are violated or not.
(4 marks)

b) Plot the residuals against the predicted values. Hence, draw your conclusion.
(4 marks)

c) Perform the lack of fit test to determine the linearity of the regression function at a 5% level
of significance.
(12 marks)
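
The arithmetic behind the lack of fit test in part c) can be sketched as follows. This is an illustrative check under stated assumptions, not a model answer: SSE is taken as the sum of the squared residuals listed in the table, the pure error sum of squares is computed within each distinct value of X, and the critical value 4.53 is assumed from a standard F table (α = 0.05 with 4 and 6 degrees of freedom).

```python
from collections import defaultdict

# Data and residuals from the Question 3 table.
time_x = [2, 6, 5, 4, 5, 8, 6, 8, 5, 4, 7, 6]
weight_loss = [2.5, 6.7, 3.9, 4.1, 5.3, 8.8, 6.2, 7.8, 4.3, 5.8, 6.5, 6.1]
residuals = [0.058, 0.573, -1.306, -0.185, 0.094, 0.830,
             0.073, -0.170, -0.906, 1.515, -0.548, -0.027]

n = len(time_x)

# Error sum of squares from the listed residuals.
sse = sum(e ** 2 for e in residuals)

# Group the responses by distinct X level to obtain the pure error.
groups = defaultdict(list)
for x, y in zip(time_x, weight_loss):
    groups[x].append(y)

sspe = 0.0
for ys in groups.values():
    group_mean = sum(ys) / len(ys)
    sspe += sum((y - group_mean) ** 2 for y in ys)

c = len(groups)          # number of distinct X levels
sslf = sse - sspe        # lack of fit sum of squares
df_lf = c - 2            # lack of fit degrees of freedom
df_pe = n - c            # pure error degrees of freedom

f_stat = (sslf / df_lf) / (sspe / df_pe)

# Assumed table value: F(0.05; 4, 6) = 4.53.
F_CRITICAL = 4.53
lack_of_fit = f_stat > F_CRITICAL

print(f"SSE = {sse:.4f}, SSPE = {sspe:.4f}, SSLF = {sslf:.4f}")
print(f"F = {f_stat:.3f}, evidence of lack of fit: {lack_of_fit}")
```

If the F statistic does not exceed the critical value, there is no evidence at the 5% level that the linear regression function is inadequate.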

QUESTION 4
A researcher conducted a study to predict the hours per week a husband spends on
housework. The information on twelve families included:

y = hours per week the husband spends on housework
x1 = number of children in the family
x2 = wife’s years of education
x3 = husband’s years of education

A multiple regression model was fitted to these variables and the results of the
analysis are given in the following tables.


Model 1       Sum of Squares
Regression    14.274
Residual       2.393
Total         16.667

Coefficients(a)
(B and Std. Error: unstandardized coefficients; Beta: standardized coefficient; Tolerance and VIF: collinearity statistics)

Model 1       B       Std. Error   Beta     t        Sig.   Tolerance   VIF
(Constant)    2.021   1.681                 1.203    .263
X1             .367    .185         .348    1.984    .082   .584        1.711
X2             .271    .080         .491    3.386    .010   .853        1.173
X3            -.211    .081        -.425   -2.854    .032   .663        1.509

a. Dependent Variable: Y

a) Write down the estimated regression model.
(2 marks)

b) Is the regression model useful for predicting Y? Use a 5% significance level.
(4 marks)

c) Calculate the coefficient of determination. Interpret its meaning.
(2 marks)

d) Interpret the estimated value of β2.
(2 marks)

e) Find the 95% confidence interval for β2 and interpret.
(4 marks)

f) Using the p-value approach, identify whether the number of children in the family and
the husband’s years of education are significant factors in the model. Use a 5% significance
level.
(6 marks)


g) Does this model have a multicollinearity problem? Give a reason to support your answer.
Suggest one approach to remedy the multicollinearity problem.
(3 marks)
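
Several parts of Question 4 reduce to arithmetic on the two tables. The sketch below is illustrative only: it reproduces the overall F statistic (part b), the coefficient of determination (part c), the confidence interval for β2 (part e), the p-value comparisons (part f) and a VIF check (part g). The critical values 4.07 and 2.306 are assumptions taken from standard F and t tables with 3 and 8, and 8, degrees of freedom, and the VIF cut-off of 10 is a common rule of thumb rather than something stated in the paper.

```python
n, k = 12, 3                      # families, predictors
ssr, sse, ssto = 14.274, 2.393, 16.667
df_error = n - k - 1              # 8

# b) Overall F test: H0: beta1 = beta2 = beta3 = 0.
f_stat = (ssr / k) / (sse / df_error)
F_CRITICAL = 4.07                 # assumed table value F(0.05; 3, 8)
model_useful = f_stat > F_CRITICAL

# c) Coefficient of determination.
r_squared = ssr / ssto

# e) 95% confidence interval for beta2 (coefficient of X2).
b2, se_b2 = 0.271, 0.080
T_CRITICAL = 2.306                # assumed table value t(0.025, 8)
ci_beta2 = (b2 - T_CRITICAL * se_b2, b2 + T_CRITICAL * se_b2)

# f) p-value approach for X1 and X3 at alpha = 0.05.
alpha = 0.05
p_values = {"X1": 0.082, "X3": 0.032}
significant = {name: p < alpha for name, p in p_values.items()}

# g) Multicollinearity check using a rule-of-thumb VIF cut-off (assumption).
vif = {"X1": 1.711, "X2": 1.173, "X3": 1.509}
VIF_THRESHOLD = 10
multicollinearity = any(v > VIF_THRESHOLD for v in vif.values())

print(f"F = {f_stat:.3f}, model useful: {model_useful}")
print(f"R^2 = {r_squared:.4f}")
print(f"95% CI for beta2: ({ci_beta2[0]:.4f}, {ci_beta2[1]:.4f})")
print(f"significant at 5%: {significant}")
print(f"multicollinearity flagged: {multicollinearity}")
```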

QUESTION 5

Based on the information given below, answer the following questions.

Given n = 20, SSTO = 5429.54

Model   Independent variables   SSR
1       X1                      2960.34
2       X2                      3634.85
3       X3                      3579.55
4       X1, X2                  5123.80
5       X1, X3                  5411.60
6       X2, X3                  3751.32
7       X1, X2, X3              5127.99

Using the forward selection method, the first variable to enter the model at stage one of the
procedure is X2. Continue the procedure to obtain the most appropriate final model. Use a 5%
significance level.
(6 marks)
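
The remaining stages of the forward selection can be sketched with partial F statistics computed from the SSR table. This is a sketch under stated assumptions, not a model answer: the entries "X1X2", "X1X3", "X2X3" and "X1X2X3" are read as sets of variables, the procedure starts from X2 as given, and the critical values 4.45 and 4.49 are assumed from a standard F table (α = 0.05, 1 numerator degree of freedom, 17 and 16 denominator degrees of freedom).

```python
SSTO = 5429.54
N = 20

# SSR for each candidate model, keyed by the set of variables it contains.
SSR = {
    frozenset({"X1"}): 2960.34,
    frozenset({"X2"}): 3634.85,
    frozenset({"X3"}): 3579.55,
    frozenset({"X1", "X2"}): 5123.80,
    frozenset({"X1", "X3"}): 5411.60,
    frozenset({"X2", "X3"}): 3751.32,
    frozenset({"X1", "X2", "X3"}): 5127.99,
}

# Assumed table values F(0.05; 1, df_error).
F_CRITICAL = {17: 4.45, 16: 4.49}


def partial_f(current, candidate):
    """Partial F for adding `candidate` to the variables in `current`."""
    bigger = frozenset(current) | {candidate}
    extra_ss = SSR[bigger] - SSR[frozenset(current)]
    df_error = N - len(bigger) - 1
    mse = (SSTO - SSR[bigger]) / df_error
    return extra_ss / mse, df_error


selected = ["X2"]                 # stage one, as stated in the question
remaining = {"X1", "X3"}
history = []

while remaining:
    scores = {v: partial_f(selected, v) for v in remaining}
    best = max(scores, key=lambda v: scores[v][0])
    f_stat, df_error = scores[best]
    history.append((best, f_stat, df_error))
    if f_stat <= F_CRITICAL[df_error]:
        break                     # best candidate is not significant: stop
    selected.append(best)
    remaining.remove(best)

for name, f_stat, df_error in history:
    print(f"candidate {name}: partial F = {f_stat:.3f} on (1, {df_error}) df")
print("final model contains:", selected)
```

At each stage the candidate with the largest partial F is considered; it enters only if its F exceeds the critical value, otherwise the procedure stops.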

QUESTION 6
A health researcher wants to predict VOmax, an indicator of fitness and health. The researcher
recruited 35 participants to perform a VOmax test. The researcher’s goal is to predict VOmax
based on four factors: age, weight, heart rate and gender (male = 1). The SPSS output is
given below.

(B and Std. Error: unstandardized coefficients; Beta: standardized coefficient)

Model 1       B        Std. Error   Beta     t
(Constant)    87.830   6.385                 13.756
age            -.165   0.063        -.176    -2.633
weight         -.385   0.043        -.677    -8.877
Heart rate     -.118   0.032        -.252    -3.667
gender        13.208   1.344         .748     9.824

a) It was claimed that male participants have higher VOmax scores than female participants.
Test this claim. Use a 5% significance level.
(4 marks)


b) The researcher wants to include the factor ‘type of fitness’ in the model. ‘Type of fitness’
is categorized as running, swimming or cycling. Describe how the researcher would
incorporate the new variable in the regression model.
(2 marks)
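
Both parts of Question 6 can be illustrated briefly. The sketch below is hedged, not a model answer: for part a) it recomputes the t statistic for the gender coefficient and compares it with an assumed one-tailed critical value t(0.05, 30) = 1.697 from a standard t table; for part b) it shows one common way to code a three-level factor with two dummy variables. The choice of running as the reference category and the function name `encode_fitness` are assumptions made for illustration.

```python
n, k = 35, 4                      # participants, predictors
df_error = n - k - 1              # 30

# a) One-tailed test of H0: beta_gender = 0 against H1: beta_gender > 0.
b_gender, se_gender = 13.208, 1.344
t_stat = b_gender / se_gender
T_CRITICAL = 1.697                # assumed table value t(0.05, 30)
claim_supported = t_stat > T_CRITICAL

# b) A factor with 3 categories needs 3 - 1 = 2 dummy variables.
CATEGORIES = ("running", "swimming", "cycling")


def encode_fitness(kind):
    """Return (d_swimming, d_cycling); running is the assumed reference level."""
    if kind not in CATEGORIES:
        raise ValueError(f"unknown type of fitness: {kind!r}")
    return (int(kind == "swimming"), int(kind == "cycling"))


print(f"t = {t_stat:.3f}, claim supported at 5%: {claim_supported}")
for kind in CATEGORIES:
    print(kind, encode_fitness(kind))
```

With this coding, each dummy coefficient would measure the difference in mean VOmax between that type of fitness and the reference category, holding the other predictors fixed.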

QUESTION 7

Puan Nora, a real estate negotiator, believes that the price of a house depends on the
size of the house (X1, in square feet) and the tax imposed by the government (X2, in
RM). She recorded the prices (in RM ‘00) of 117 houses sold by her agency over the last five
years. A second-order model with interaction was then obtained as shown below.

ANOVA(a)

Model 1       Sum of Squares   df    Mean Square   F        Sig.
Regression    13187947.582       5   2637589.516   81.303   .000(b)
Residual       3600999.204     111     32441.434
Total         16788946.786     116

a. Dependent Variable: Y
b. Predictors: (Constant), X1X2, X1², X2², X1, X2

Coefficients(a)
(B and Std. Error: unstandardized coefficients; Beta: standardized coefficient)

Model 1       B           Std. Error   Beta     t        Sig.
(Constant)    -197.718    159.256               -1.242   .217
X1               1.072       .183       1.476    5.855   .000
X2               -.406       .216       -.877   -1.884   .062
X1²               .001       .000      -1.286   -4.107   .000
X2²          -2.078E-5       .000       -.377   -2.304   .083
X1X2              .011       .000       1.578    3.513   .001

a. Dependent Variable: Y

a) State the estimated model obtained.
(2 marks)

b) Does the quadratic term for the size of the house affect the house’s price? Use the p-value
approach at a 5% significance level.
(3 marks)


c) Does the interaction between the independent variables significantly affect a house’s
price? Test using the p-value approach at a 5% significance level.
(3 marks)

d) Write the final model based on the result obtained in b) and c).
(2 marks)
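
The p-value comparisons in parts b) and c) can be laid out in a few lines. The sketch below is illustrative only: it first checks that the mean squares and the F statistic in the ANOVA table follow from the sums of squares, then flags each term whose reported Sig. value is below α = 0.05. The term labels in the dictionary are illustrative names, and the sketch deliberately does not decide the final model asked for in part d).

```python
# ANOVA table values from Question 7.
ss_reg, df_reg = 13187947.582, 5
ss_res, df_res = 3600999.204, 111

# Mean squares and overall F statistic.
ms_reg = ss_reg / df_reg
ms_res = ss_res / df_res
f_stat = ms_reg / ms_res

# Reported Sig. values from the coefficients table.
alpha = 0.05
p_values = {
    "X1": 0.000,
    "X2": 0.062,
    "X1^2": 0.000,
    "X2^2": 0.083,
    "X1X2": 0.001,
}

# p-value approach: a term is significant when its p-value is below alpha.
significant = {term: p < alpha for term, p in p_values.items()}

print(f"MSR = {ms_reg:.3f}, MSE = {ms_res:.3f}, F = {f_stat:.3f}")
for term, flag in significant.items():
    print(f"{term}: p = {p_values[term]:.3f}, significant at 5%: {flag}")
```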

END OF QUESTION PAPER
