Unit-2 Numericals
Linear regression is one of the most basic and commonly used forms of predictive analysis. One variable is
considered to be an explanatory variable, and the other is considered to be a dependent variable. For
example, a modeler might want to relate the weights of individuals to their heights using a linear
regression model.
There are several types of linear regression analysis available to the researcher.
Simple linear regression
b (slope) = (nΣxy - (Σx)(Σy)) / (nΣx^2 - (Σx)^2)
a (intercept) = (ΣyΣx^2 - ΣxΣxy) / (nΣx^2 - (Σx)^2)
Where,
x and y are the two variables on the regression line y = a + bx.
b = Slope of the line.
a = y-intercept of the line.
x = Values of the first data set.
y = Values of the second data set.
n = Number of (x, y) data pairs.
Solved Examples
Question: Find the linear regression equation for the following two sets of data:
x 2 4 6 8
y 3 7 5 10
Solution:
Construct the following table:
x    y    x^2   xy
2    3    4     6
4    7    16    28
6    5    36    30
8    10   64    80
Σx = 20   Σy = 25   Σx^2 = 120   Σxy = 144
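b = (nΣxy - (Σx)(Σy)) / (nΣx^2 - (Σx)^2)
  = (4*144 - 20*25) / (4*120 - 20^2) = 76/80
b = 0.95
a = (ΣyΣx^2 - ΣxΣxy) / (nΣx^2 - (Σx)^2)
  = (25*120 - 20*144) / (4*120 - 20^2) = 120/80
a = 1.5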
Linear regression is given by:
y = a + bx
y = 1.5 + 0.95 x
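The arithmetic of this solved example can be checked with a short script. This is only a minimal sketch: the function name `fit_line` is made up for illustration, and it simply applies the two summation formulas used in the solution (a is the intercept, b is the slope).

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_x2 - sum_x ** 2                   # n*Σx^2 - (Σx)^2
    b = (n * sum_xy - sum_x * sum_y) / denom          # slope
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom     # y-intercept
    return a, b


# Data from the solved example: x = 2, 4, 6, 8 and y = 3, 7, 5, 10
a, b = fit_line([2, 4, 6, 8], [3, 7, 5, 10])
print(f"y = {a} + {b} x")
```

The same function can be reused for any other pair of data sets of equal length, provided the x values are not all identical (otherwise the denominator is zero).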
Linear Regression
Problems with Solutions
Linear regression and modelling problems are presented below, followed by their solutions. A linear
regression calculator and grapher may also be used to check answers and create more opportunities for
practice.
Review
If the plot of n pairs of data (x, y) for an experiment appears to indicate a "linear relationship" between y
and x, then the method of least squares may be used to write a linear relationship between x and y.
The least squares regression line is the line that minimizes the sum of the squares
(d1^2 + d2^2 + d3^2 + d4^2) of the vertical deviations from each data point to the line (see the figure
below for an example with 4 points).
Figure 1. Linear regression where the sum of the squares of the vertical distances d1, d2, d3, d4 between
observed and predicted (line and its equation) values is minimized.
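To make the "minimizes the sum of the squares" statement concrete, the sketch below compares the sum of squared vertical deviations for the line y = 0.9x + 2.2 (the regression line found for the Problem 3 data later in these notes) with a few slightly shifted lines. The helper name and the choice of 0.1 shifts are illustrative assumptions, not part of the source.

```python
def sum_squared_deviations(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Problem 3 data and its least square regression line y = 0.9x + 2.2
xs = [0, 1, 2, 3, 4]
ys = [2, 3, 5, 4, 6]
best = sum_squared_deviations(xs, ys, 0.9, 2.2)

# Nudge the slope and/or intercept by 0.1 in each direction
nearby = [
    sum_squared_deviations(xs, ys, 0.9 + ds, 2.2 + di)
    for ds in (-0.1, 0.0, 0.1)
    for di in (-0.1, 0.0, 0.1)
    if (ds, di) != (0.0, 0.0)
]

print("regression line:", best)
print("smallest value among shifted lines:", min(nearby))
```

Every shifted line gives a larger sum than the regression line, which is what the least squares criterion requires.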
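The least square regression line for the set of n data points is given by the equation of a line in
slope-intercept form:
y = ax + b
where a (the slope) and b (the y-intercept) are given by
a = (nΣxy - ΣxΣy) / (nΣx^2 - (Σx)^2)
b = (1/n)(Σy - aΣx)
Note that a and b swap roles relative to the solved example above, where the line was written y = a + bx.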
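• Problem 1
Consider the set of points {(-2, -1), (1, 1), (3, 2)}.
a) Find the least square regression line for the given data points.
b) Plot the given points and the regression line in the same rectangular system of axes.
• Problem 2
a) Find the least square regression line for the following set of data: {(-1, 0), (0, 2), (1, 4), (2, 5)}.
b) Plot the given points and the regression line in the same rectangular system of axes.
• Problem 3
The values of x and their corresponding values of y are shown in the table below.
x 0 1 2 3 4
y 2 3 5 4 6
a) Find the least square regression line y = ax + b.
b) Estimate the value of y when x = 10.
• Problem 4
The sales of a company (in million dollars) for each year are shown in the table below.
x (year) 2005 2006 2007 2008 2009
y (sales) 12 19 29 37 45
a) Find the least square regression line y = ax + b.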
Solution to Problem 1
a) Let us organize the data in a table.
x    y    xy   x^2
-2   -1   2    4
1    1    1    1
3    2    6    9
Σx = 2   Σy = 2   Σxy = 9   Σx^2 = 14
We now use the least square regression formulas to calculate a and b as follows
a = (nΣxy - ΣxΣy) / (nΣx^2 - (Σx)^2) = (3*9 - 2*2) / (3*14 - 2^2) = 23/38
b = (1/n)(Σy - aΣx) = (1/3)(2 - (23/38)*2) = 5/19
b) We now graph the regression line given by y = ax + b and the given points.
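Because this solution is stated as a fraction, it can be convenient to check it with exact rational arithmetic rather than floating point. The following is a sketch using Python's standard `fractions` module; the function name `least_squares_exact` is a hypothetical helper, and the formulas are the ones used in these solutions (a is the slope, b is the intercept).

```python
from fractions import Fraction


def least_squares_exact(points):
    """Return the exact slope a and intercept b of y = a*x + b as Fractions."""
    n = len(points)
    sum_x = sum(Fraction(x) for x, _ in points)
    sum_y = sum(Fraction(y) for _, y in points)
    sum_xy = sum(Fraction(x) * y for x, y in points)
    sum_x2 = sum(Fraction(x) ** 2 for x, _ in points)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b


# Points from the table above: (-2, -1), (1, 1), (3, 2)
a, b = least_squares_exact([(-2, -1), (1, 1), (3, 2)])
print(a, b)  # 23/38 5/19
```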
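Solution to Problem 2
a) We use a table as follows
x    y    xy   x^2
-1   0    0    1
0    2    0    0
1    4    4    1
2    5    10   4
Σx = 2   Σy = 11   Σxy = 14   Σx^2 = 6
We now calculate a and b using the least square regression formulas for a and b.
a = (nΣxy - ΣxΣy) / (nΣx^2 - (Σx)^2) = (4*14 - 2*11) / (4*6 - 2^2) = 17/10 = 1.7
b = (1/n)(Σy - aΣx) = (1/4)(11 - 1.7*2) = 1.9
b) We now graph the regression line given by y = ax + b and the given points.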
Solution to Problem 3
a) We use a table to calculate a and b.
x    y    xy   x^2
0    2    0    0
1    3    3    1
2    5    10   4
3    4    12   9
4    6    24   16
Σx = 10   Σy = 20   Σxy = 49   Σx^2 = 30
We now calculate a and b using the least square regression formulas for a and b.
a = (nΣxy - ΣxΣy) / (nΣx^2 - (Σx)^2) = (5*49 - 10*20) / (5*30 - 10^2) = 0.9
b = (1/n)(Σy - aΣx) = (1/5)(20 - 0.9*10) = 2.2
b) Now that we have the least square regression line y = 0.9 x + 2.2, substitute 10 for x to find the
value of the corresponding y.
y = 0.9 * 10 + 2.2 = 11.2
Solution to Problem 4
a) We first change the variable x into t such that t = x - 2005, so that t represents the
number of years after 2005. Using t instead of x makes the numbers smaller and therefore
more manageable. The table of values becomes:
t (years after 2005) 0 1 2 3 4
y (sales) 12 19 29 37 45
We now use the table to calculate a and b in the least square regression line formula.
t    y    ty    t^2
0    12   0     0
1    19   19    1
2    29   58    4
3    37   111   9
4    45   180   16
Σt = 10   Σy = 142   Σty = 368   Σt^2 = 30
We now calculate a and b using the least square regression formulas for a and b.
a = (nΣty - ΣtΣy) / (nΣt^2 - (Σt)^2) = (5*368 - 10*142) / (5*30 - 10^2) = 8.4
b = (1/n)(Σy - aΣt) = (1/5)(142 - 8.4*10) = 11.6
The least square regression line is therefore y = 8.4 t + 11.6.
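The change of variable t = x - 2005 only shifts the horizontal axis, so it should leave the slope unchanged and move the intercept by a * 2005. The sketch below checks this numerically. The year values 2005 to 2009 are inferred from t = 0, ..., 4, and the function name `least_squares` is an illustrative assumption.

```python
def least_squares(xs, ys):
    """Return slope a and intercept b of the line y = a*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b


years = [2005, 2006, 2007, 2008, 2009]   # inferred from t = x - 2005
sales = [12, 19, 29, 37, 45]

# Fit once on the shifted variable t and once on the raw years x
a_t, b_t = least_squares([x - 2005 for x in years], sales)
a_x, b_x = least_squares(years, sales)

print("fit on t:", a_t, b_t)
print("fit on x:", a_x, b_x)
print("intercept implied by the shift:", b_t - a_t * 2005)
```

Both fits describe the same line; working with t simply keeps the sums small enough to compute by hand.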
Example 9.9
Calculate the regression coefficient and obtain the lines of regression for the following data
Solution:
Regression coefficient of X on Y
(i) Regression equation of X on Y
(ii) Regression equation of Y on X
Y = 0.929X + 7.284
Example 9.10
Calculate the two regression equations of X on Y and Y on X from the data given below, taking deviations
from the actual means of X and Y.
Solution:
When X = 20,
Y = –0.25(20) + 44.25
= –5 + 44.25
= 39.25 (when the price is Rs. 20, the likely demand is 39.25)
Example 9.11
Obtain regression equation of Y on X and estimate Y when X=55 from the following
Solution:
(i) Regression coefficients of Y on X
(ii) Regression equation of Y on X
Y – 51.57 = 0.942(X – 48.29)
Y = 0.942X – 45.49 + 51.57
Y = 0.942X + 6.08
When X = 55, Y = 0.942(55) + 6.08 = 57.89
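The step from the deviation form Y – Ȳ = b(X – X̄) to the slope-intercept form can be checked with a few lines of code. This sketch takes the slope 0.942 and the means 48.29 and 51.57 directly from the equation above; the function name is a hypothetical helper.

```python
def regression_line_from_means(slope, mean_x, mean_y):
    """Convert Y - mean_y = slope * (X - mean_x) into (slope, intercept)."""
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = regression_line_from_means(0.942, 48.29, 51.57)
estimate = slope * 55 + intercept   # estimated Y when X = 55

print(round(intercept, 2), round(estimate, 2))  # 6.08 57.89
```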
Example 9.12
Find the means of X and Y variables and the coefficient of correlation between them from the following
two regression equations:
2Y–X–50 = 0
3Y–2X–10 = 0.
Solution:
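We are given
2Y – X – 50 = 0 ... (1)
3Y – 2X – 10 = 0 ... (2)
Solving (1) and (2) simultaneously: multiplying (1) by 2 and subtracting (2) gives Y – 90 = 0.
We get Y = 90
Substituting Y = 90 in (1),
We get X = 130
Hence the means are X̄ = 130 and Ȳ = 90.
Taking (1) as the regression equation of Y on X:
2Y = X + 50, so Y = 0.5X + 25 and byx = 0.5
Taking (2) as the regression equation of X on Y:
2X = 3Y – 10, so X = 1.5Y – 5 and bxy = 1.5
r = √(byx × bxy) = √(0.5 × 1.5) = √0.75 ≈ 0.866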
Example 9.13
Find the means of X and Y variables and the coefficient of correlation between them from the following
two regression equations:
4X–5Y+33 = 0
20X–9Y–107 = 0
Solution:
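We are given
4X – 5Y + 33 = 0 ... (1)
20X – 9Y – 107 = 0 ... (2)
Solving (1) and (2) simultaneously (multiply (1) by 5 and subtract (2)),
We get Y = 17
Substituting Y = 17 in (1), we get X = 13. Hence the means are X̄ = 13 and Ȳ = 17.
Assume first that equation (1) is the regression equation of X on Y and equation (2) is the regression
equation of Y on X. Then bxy = 5/4 = 1.25 and byx = 20/9 ≈ 2.22.
But this is not possible because both the regression coefficients are greater than 1.
So our above assumption is wrong. Therefore treating equation (1) as the regression equation of Y on X and
equation (2) as the regression equation of X on Y, we get
byx = 4/5 = 0.8 and bxy = 9/20 = 0.45
r = √(byx × bxy) = √(0.8 × 0.45) = √0.36 = 0.6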
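The procedure used in Examples 9.12 and 9.13 (solve the two equations for the means, then pick the assignment of "Y on X" and "X on Y" whose regression coefficients have a product of at most 1) can be sketched in code. The function name and the (p, q, c) encoding of an equation pX + qY + c = 0 are assumptions made for illustration only.

```python
from math import sqrt


def analyse_regression_pair(eq1, eq2):
    """Each equation is a tuple (p, q, c) meaning p*X + q*Y + c = 0.

    Returns (mean_x, mean_y, r), using the assignment of roles for which
    the product of the two regression coefficients lies between 0 and 1.
    """
    (p1, q1, c1), (p2, q2, c2) = eq1, eq2

    # The regression lines intersect at the means (Cramer's rule)
    det = p1 * q2 - p2 * q1
    mean_x = (q1 * c2 - q2 * c1) / det
    mean_y = (p2 * c1 - p1 * c2) / det

    # Try both assignments of the 'Y on X' and 'X on Y' roles
    for y_on_x, x_on_y in ((eq1, eq2), (eq2, eq1)):
        byx = -y_on_x[0] / y_on_x[1]   # Y = -(p/q) X - c/q
        bxy = -x_on_y[1] / x_on_y[0]   # X = -(q/p) Y - c/p
        product = byx * bxy
        if 0 <= product <= 1:
            r = sqrt(product)
            # r carries the common sign of the regression coefficients
            return mean_x, mean_y, (-r if byx < 0 else r)
    raise ValueError("no valid assignment of regression roles")


# Example 9.13: 4X - 5Y + 33 = 0 and 20X - 9Y - 107 = 0
mean_x, mean_y, r = analyse_regression_pair((4, -5, 33), (20, -9, -107))
print(mean_x, mean_y, round(r, 3))
```

For Example 9.12 the equations 2Y – X – 50 = 0 and 3Y – 2X – 10 = 0 would be passed as `(-1, 2, -50)` and `(-2, 3, -10)`.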
Example 9.16
For 5 pairs of observations the following results are obtained: ∑X = 15, ∑Y = 25, ∑X^2 = 55, ∑Y^2 = 135,
∑XY = 83. Find the equations of the lines of regression and estimate the value of X on the first line
when Y = 12 and the value of Y on the second line if X = 8.
Solution:
Regression equation of Y on X:
Y – 5 = 0.8(X – 3)
Y = 0.8X + 2.6
When X = 8, Y = 0.8(8) + 2.6 = 9
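Both regression lines in this example follow directly from the five summary totals, so the whole calculation can be sketched in a few lines. The function name is a hypothetical helper, and it assumes, as the question states, that the "first line" is the regression of X on Y and the "second line" is the regression of Y on X. The estimate of X at Y = 12 is computed by the code, not quoted from the source.

```python
def regression_lines_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return ((byx, c1), (bxy, c2)) for Y = byx*X + c1 and X = bxy*Y + c2."""
    mean_x, mean_y = sum_x / n, sum_y / n
    numerator = n * sum_xy - sum_x * sum_y
    byx = numerator / (n * sum_x2 - sum_x ** 2)   # regression coefficient of Y on X
    bxy = numerator / (n * sum_y2 - sum_y ** 2)   # regression coefficient of X on Y
    y_on_x = (byx, mean_y - byx * mean_x)
    x_on_y = (bxy, mean_x - bxy * mean_y)
    return y_on_x, x_on_y


# Totals given in Example 9.16
y_on_x, x_on_y = regression_lines_from_sums(5, 15, 25, 55, 135, 83)

x_at_y12 = x_on_y[0] * 12 + x_on_y[1]   # estimate of X when Y = 12
y_at_x8 = y_on_x[0] * 8 + y_on_x[1]     # estimate of Y when X = 8

print("Y on X:", y_on_x)
print("X on Y:", x_on_y)
print("X when Y = 12:", x_at_y12)
print("Y when X = 8:", y_at_x8)
```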