Unit-2 Numericals

Linear regression is a commonly used predictive analysis technique where one variable is considered explanatory and the other dependent. There are several types including simple, multiple, logistic, ordinal, and multinomial regression. Simple linear regression involves one dependent and one independent variable, while multiple linear regression has one dependent and two or more independent variables. The linear regression equation is represented by y = a + bx, where a is the y-intercept and b is the slope. The formulas to calculate a and b use the sum of x and y values, the sum of x squared, and the sum of the product of x and y. Several examples are provided to demonstrate calculating a and b from data sets and using the linear regression equation to estimate

Uploaded by

SHIKHA SHARMA

Unit-2 Linear Regression Numericals

Linear regression is the most basic and commonly used predictive analysis. One variable is considered to
be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler
might want to relate the weights of individuals to their heights using a linear regression model.
There are several linear regression analyses available to the researcher.
Simple linear regression
• One dependent variable (interval or ratio)
• One independent variable (interval or ratio or dichotomous)

Multiple linear regression
• One dependent variable (interval or ratio)
• Two or more independent variables (interval or ratio or dichotomous)

Logistic regression
• One dependent variable (binary)
• Two or more independent variable(s) (interval or ratio or dichotomous)

Ordinal regression
• One dependent variable (ordinal)
• One or more independent variable(s) (nominal or dichotomous)

Multinomial regression
• One dependent variable (nominal)
• One or more independent variable(s) (interval or ratio or dichotomous)

Discriminant analysis
• One dependent variable (nominal)
• One or more independent variable(s) (interval or ratio)

The linear regression equation is given by:

\[ y = a + bx \]

where a and b are given by the following formulas:

\[ b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \]

\[ a = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2} \]

Where,
x and y are the two variables on the regression line.
b = slope of the line.
a = y-intercept of the line.
x = values of the first data set.
y = values of the second data set.
n = number of (x, y) data pairs.

Solved Examples
Question: Find linear regression equation for the following two sets of data:

x    2    4    6    8
y    3    7    5    10

Solution:
Construct the following table:

x         y         x²         xy
2         3         4          6
4         7         16         28
6         5         36         30
8         10        64         80
∑x = 20   ∑y = 25   ∑x² = 120  ∑xy = 144

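With n = 4 data pairs:

\[ b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{4(144) - (20)(25)}{4(120) - (20)^2} = \frac{76}{80} \]

b = 0.95

\[ a = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2} = \frac{(25)(120) - (20)(144)}{4(120) - (20)^2} = \frac{120}{80} \]

a = 1.5
The linear regression equation is given by:
y = a + bx
y = 1.5 + 0.95x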
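
The arithmetic in this solved example can be checked with a short script. The following is a minimal sketch in Python, not part of the original notes; the function name fit_line is a hypothetical helper that applies the two summation formulas for a (intercept) and b (slope) directly.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denom        # slope
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom   # y-intercept
    return a, b


a, b = fit_line([2, 4, 6, 8], [3, 7, 5, 10])
print(f"y = {a} + {b}x")  # y = 1.5 + 0.95x
```

The guard on the denominator is an added assumption: the formulas break down when every x value is the same, because no unique slope exists.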
Linear Regression Problems with Solutions

Linear regression and modelling problems are presented below, followed by their solutions. A linear
regression calculator and grapher may also be used to check answers and create more opportunities for
practice.

Review
If the plot of n pairs of data (x , y) for an experiment appears to indicate a "linear relationship" between y
and x, then the method of least squares may be used to write a linear relationship between x and y.
The least squares regression line is the line that minimizes the sum of the squares (d1² + d2² + d3² + d4²) of
the vertical deviations from each data point to the line (see the figure below for an example with 4 points).

Figure 1. Linear regression where the sum of the squared vertical distances d1² + d2² + d3² + d4² between
observed and predicted (line and its equation) values is minimized.

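The least squares regression line for the set of n data points is given by the equation of a line in
slope-intercept form:

y = a x + b

where a and b are given by

a = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²)

b = (1/n)(Σy - a Σx)

Note that in this part a is the slope and b is the y-intercept, which is the reverse of the notation
y = a + bx used in the first solved example.

Figure 2. Formulas for the constants a and b included in the linear regression.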
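
To illustrate what "least squares" means in practice, the sketch below (Python, not from the original notes) fits the line for the data of Problem 2 and then checks that nudging the slope or the intercept only increases the sum of squared vertical deviations. The helper names sse and least_squares and the step size 0.1 are assumptions made for this illustration.

```python
points = [(-1, 0), (0, 2), (1, 4), (2, 5)]  # data points of Problem 2


def sse(a, b, pts):
    """Sum of squared vertical deviations from the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in pts)


def least_squares(pts):
    """Return (a, b) with a the slope and b the intercept of y = a*x + b."""
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    sx2 = sum(x * x for x, _ in pts)
    a = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
    b = (sy - a * sx) / n
    return a, b


a, b = least_squares(points)
best = sse(a, b, points)

# Moving the slope or intercept in either direction gives a larger sum.
for da, db in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]:
    assert sse(a + da, b + db, points) > best

print(round(a, 6), round(b, 6), round(best, 6))
```

This is only a spot check at four nearby lines, not a proof that the fitted line is the global minimum.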

• Problem 1

Consider the following set of points: {(-2 , -1) , (1 , 1) , (3 , 2)}


a) Find the least square regression line for the given data points.
b) Plot the given points and the regression line in the same rectangular system of axes.

• Problem 2

a) Find the least square regression line for the following set of data

{(-1 , 0),(0 , 2),(1 , 4),(2 , 5)}

b) Plot the given points and the regression line in the same rectangular system of axes.

• Problem 3

The values of x and their corresponding values of y are shown in the table below

x    0    1    2    3    4
y    2    3    5    4    6

a) Find the least squares regression line y = a x + b.

b) Estimate the value of y when x = 10.

• Problem 4

The sales of a company (in million dollars) for each year are shown in the table below.

x (year)     2005    2006    2007    2008    2009
y (sales)    12      19      29      37      45

a) Find the least squares regression line y = a x + b.

b) Use the least squares regression line as a model to estimate the sales of the company in 2012.

Solutions to the Above Problems

1. a) Let us organize the data in a table.

x         y         xy         x²
-2        -1        2          4
1         1         1          1
3         2         6          9
Σx = 2    Σy = 2    Σxy = 9    Σx² = 14

We now use the above formulas to calculate a and b as follows
a = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) = (3*9 - 2*2) / (3*14 - 2²) = 23/38

b = (1/n)(Σy - a Σx) = (1/3)(2 - (23/38)*2) = 5/19
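
Because the answers to Problem 1 are fractions, exact rational arithmetic is a convenient way to confirm them. A minimal sketch using Python's standard fractions module (the variable names are illustrative, not from the original):

```python
from fractions import Fraction

points = [(-2, -1), (1, 1), (3, 2)]  # data points of Problem 1
n = len(points)
sx = sum(x for x, _ in points)
sy = sum(y for _, y in points)
sxy = sum(x * y for x, y in points)
sx2 = sum(x * x for x, _ in points)

a = Fraction(n * sxy - sx * sy, n * sx2 - sx ** 2)  # slope
b = (sy - a * sx) / n                               # intercept
print(a, b)  # 23/38 5/19
```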

b) We now graph the regression line given by y = a x + b and the given points.

Figure 3. Graph of linear regression in problem 1.

2. a) We use a table as follows

x         y          xy          x²
-1        0          0           1
0         2          0           0
1         4          4           1
2         5          10          4
Σx = 2    Σy = 11    Σxy = 14    Σx² = 6

We now use the above formulas to calculate a and b as follows

a = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) = (4*14 - 2*11) / (4*6 - 2²) = 17/10 = 1.7

b = (1/n)(Σy - a Σx) = (1/4)(11 - 1.7*2) = 1.9

b) We now graph the regression line given by y = a x + b and the given points.

Figure 4. Graph of linear regression in problem 2.

3. a) We use a table to calculate a and b.

x          y          xy          x²
0          2          0           0
1          3          3           1
2          5          10          4
3          4          12          9
4          6          24          16
Σx = 10    Σy = 20    Σxy = 49    Σx² = 30

We now calculate a and b using the least squares regression formulas for a and b.
a = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) = (5*49 - 10*20) / (5*30 - 10²) = 0.9

b = (1/n)(Σy - a Σx) = (1/5)(20 - 0.9*10) = 2.2

b) Now that we have the least square regression line y = 0.9 x + 2.2, substitute x by 10 to find the
value of the corresponding y.
y = 0.9 * 10 + 2.2 = 11.2

4. a) We first change the variable x into t such that t = x - 2005, so that t represents the
number of years after 2005. Using t instead of x makes the numbers smaller and therefore more
manageable. The table of values becomes:

t (years after 2005)    0     1     2     3     4
y (sales)               12    19    29    37    45

We now use the table to calculate a and b included in the least squares regression line formula.

t          y           ty           t²
0          12          0            0
1          19          19           1
2          29          58           4
3          37          111          9
4          45          180          16
Σt = 10    Σy = 142    Σty = 368    Σt² = 30

We now calculate a and b using the least squares regression formulas for a and b.
a = (nΣty - ΣtΣy) / (nΣt² - (Σt)²) = (5*368 - 10*142) / (5*30 - 10²) = 8.4
b = (1/n)(Σy - a Σt) = (1/5)(142 - 8.4*10) = 11.6

b) In 2012, t = 2012 - 2005 = 7


The estimated sales in 2012 are: y = 8.4 * 7 + 11.6 = 70.4 million dollars.
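
The substitution t = x - 2005 is a convenience for hand calculation; it should not change the slope or the prediction. The sketch below (Python, with a hypothetical helper least_squares) fits the line once on the shifted years and once on the raw years and compares the two estimates for 2012.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope*x + intercept."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
    intercept = (sy - slope * sx) / n
    return slope, intercept


years = [2005, 2006, 2007, 2008, 2009]
sales = [12, 19, 29, 37, 45]

# Fit on t = x - 2005 (as in the solution) and on the raw years x.
slope_t, icpt_t = least_squares([x - 2005 for x in years], sales)
slope_x, icpt_x = least_squares(years, sales)

pred_t = slope_t * (2012 - 2005) + icpt_t
pred_x = slope_x * 2012 + icpt_x

print(round(slope_t, 6), round(slope_x, 6))
print(round(pred_t, 6), round(pred_x, 6))
```

Only the intercept differs between the two fits, because it is measured from a different origin (t = 0 versus x = 0).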

Example 9.9

Calculate the regression coefficient and obtain the lines of regression for the following data

Solution:

Regression coefficient of X on Y
(i) Regression equation of X on Y

(ii) Regression coefficient of Y on X

(iii) Regression equation of Y on X


Y = 0.929X–3.716+11

= 0.929X+7.284

The regression equation of Y on X is Y= 0.929X + 7.284

Example 9.10

Calculate the two regression equations of X on Y and Y on X from the data given below, taking deviations
from the actual means of X and Y.

Estimate the likely demand when the price is Rs.20.

Solution:

Calculation of Regression equation

(i) Regression equation of X on Y


(ii) Regression Equation of Y on X

When X is 20, Y will be

= –0.25 (20)+44.25

= –5+44.25

= 39.25 (when the price is Rs. 20, the likely demand is 39.25)

Example 9.11

Obtain regression equation of Y on X and estimate Y when X=55 from the following

Solution:
(i) Regression coefficients of Y on X
(ii) Regression equation of Y on X

Y–51.57 = 0.942(X–48.29 )

Y = 0.942X – 45.49 + 51.57

Y = 0.942X + 6.08

The regression equation of Y on X is Y = 0.942X + 6.08

Estimation of Y when X = 55

Y = 0.942(55) + 6.08 = 57.89
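
The step from the mean-point form Y – 51.57 = 0.942(X – 48.29) to the estimate at X = 55 can be reproduced numerically. This is a small sketch in Python that takes the two means and the regression coefficient exactly as quoted in Example 9.11, since the underlying data table is not available here; the names x_mean, y_mean, b_yx and predict_y are illustrative.

```python
x_mean, y_mean = 48.29, 51.57   # means quoted in Example 9.11
b_yx = 0.942                    # regression coefficient of Y on X quoted in Example 9.11


def predict_y(x):
    """Evaluate Y - y_mean = b_yx * (X - x_mean) at X = x."""
    return y_mean + b_yx * (x - x_mean)


intercept = y_mean - b_yx * x_mean
print(round(intercept, 2))      # 6.08
print(round(predict_y(55), 2))  # 57.89
```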

Example 9.12

Find the means of X and Y variables and the coefficient of correlation between them from the following
two regression equations:

2Y–X–50 = 0

3Y–2X–10 = 0.

Solution:

We are given

2Y–X–50 = 0 ... (1)

3Y–2X–10 = 0 ... (2)

Solving equation (1) and (2)

We get Y = 90

Putting the value of Y in equation (1)

We get X = 130

Calculating correlation coefficient

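Let us assume equation (1) to be the regression equation of Y on X

2Y = X + 50

Y = (1/2)X + 25, so the regression coefficient of Y on X is 1/2

Let us assume equation (2) to be the regression equation of X on Y

2X = 3Y – 10

X = (3/2)Y – 5, so the regression coefficient of X on Y is 3/2

r = √(1/2 × 3/2) = √0.75 ≈ 0.866
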
Example 9.13

Find the means of X and Y variables and the coefficient of correlation between them from the following
two regression equations:

4X–5Y+33 = 0

20X–9Y–107 = 0

Solution:

We are given

4X–5Y+33 = 0 ... (1)

20X–9Y–107 =0 ... (2)

Solving equation (1) and (2)

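We get Y = 17

Putting the value of Y in equation (1)

We get X = 13

Calculating correlation coefficient

Let us assume equation (1) to be the regression equation of X on Y

4X = 5Y – 33, so the regression coefficient of X on Y would be 5/4

Let us assume equation (2) to be the regression equation of Y on X

9Y = 20X – 107, so the regression coefficient of Y on X would be 20/9

But this is not possible because both the regression coefficients are greater than 1

So our above assumption is wrong. Therefore we treat equation (1) as the regression equation of Y on X and
equation (2) as the regression equation of X on Y. So we get

5Y = 4X + 33, so the regression coefficient of Y on X is 4/5

20X = 9Y + 107, so the regression coefficient of X on Y is 9/20

r = √(4/5 × 9/20) = √0.36 = 0.6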
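
The trial-and-error reasoning in Examples 9.12 and 9.13 can be sketched as a small routine. The Python below is an illustration under stated assumptions, not the textbook's method verbatim: each equation is written as a tuple (p, q, c) meaning pX + qY + c = 0, the hypothetical function analyse solves the pair for the means, and then it tries both role assignments, keeping the one whose two regression coefficients multiply to a value between 0 and 1.

```python
from fractions import Fraction
from math import sqrt


def analyse(eq1, eq2):
    """Each equation is (p, q, c), meaning p*X + q*Y + c = 0.

    Returns (x_mean, y_mean, b_yx, b_xy, r) for the assignment of roles
    that keeps the product b_yx * b_xy between 0 and 1.
    """
    (p1, q1, c1), (p2, q2, c2) = eq1, eq2
    det = p1 * q2 - p2 * q1
    # Both regression lines pass through the point of means (Cramer's rule).
    x_mean = Fraction(-c1 * q2 + c2 * q1, det)
    y_mean = Fraction(-p1 * c2 + p2 * c1, det)

    for y_on_x, x_on_y in ((eq1, eq2), (eq2, eq1)):
        b_yx = Fraction(-y_on_x[0], y_on_x[1])  # from Y = -(p/q) X - c/q
        b_xy = Fraction(-x_on_y[1], x_on_y[0])  # from X = -(q/p) Y - c/p
        product = b_yx * b_xy
        if 0 <= product <= 1:
            sign = 1 if b_yx > 0 else -1        # r carries the sign of the coefficients
            return x_mean, y_mean, b_yx, b_xy, sign * sqrt(product)
    raise ValueError("no valid assignment of the two equations")


# Example 9.13: 4X - 5Y + 33 = 0 and 20X - 9Y - 107 = 0
print(analyse((4, -5, 33), (20, -9, -107)))
# Example 9.12: 2Y - X - 50 = 0 and 3Y - 2X - 10 = 0
print(analyse((-1, 2, -50), (-2, 3, -10)))
```

The check on the product relies on the identity r² = b_yx × b_xy, which cannot exceed 1.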

Example 9.16

For 5 pairs of observations the following results are obtained ∑X=15, ∑Y=25, ∑X2 =55, ∑Y2 =135,
∑XY=83 Find the equation of the lines of regression and estimate the value of X on the first line
when Y=12 and value of Y on the second line if X=8.

Solution:
Regression equation of Y on X:

Y – 5 = 0.8(X – 3)

Y = 0.8X + 2.6

When X = 8 the value of Y is estimated as

Y = 0.8(8) + 2.6

= 9
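
Both regression lines in Example 9.16 can be obtained from the five given sums alone. The following Python sketch assumes the standard summation formulas for the two regression coefficients (the intermediate steps are not shown in the surviving solution text), and the names y_on_x and x_on_y are illustrative helpers.

```python
n = 5
sum_x, sum_y = 15, 25
sum_x2, sum_y2, sum_xy = 55, 135, 83

x_mean, y_mean = sum_x / n, sum_y / n
numerator = n * sum_xy - sum_x * sum_y            # shared by both coefficients
b_yx = numerator / (n * sum_x2 - sum_x ** 2)      # regression coefficient of Y on X
b_xy = numerator / (n * sum_y2 - sum_y ** 2)      # regression coefficient of X on Y


def y_on_x(x):
    """Second line: Y - y_mean = b_yx * (X - x_mean)."""
    return y_mean + b_yx * (x - x_mean)


def x_on_y(y):
    """First line: X - x_mean = b_xy * (Y - y_mean)."""
    return x_mean + b_xy * (y - y_mean)


print(b_yx, b_xy)                                  # 0.8 0.8
print(round(x_on_y(12), 2), round(y_on_x(8), 2))   # 8.6 9.0
```

Under these formulas the line of X on Y is X = 0.8Y – 1, which gives X = 8.6 when Y = 12, and the line of Y on X reproduces the estimate Y = 9 at X = 8.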
