Causal Forecasting Methods
Many factors can be considered in a causal analysis. For example, the sales of a
product may be related to the company's advertising budget, the price charged,
competitors' prices and promotional strategies, or even the state of the economy
and unemployment rates. In this case, sales would be called the dependent variable
and the other variables would be called independent variables. The manager's job is
to develop the best statistical relationship between sales and the independent variables.
The most common quantitative causal forecasting model is linear regression analysis.
The same mathematical model used in the least-squares method of trend projection can be
employed to perform a linear regression analysis. The dependent variable to be
forecast is still $\hat{y}$, but the independent variable, x, is no longer time.
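$$\hat{y} = a + bx$$
Where:
$\hat{y}$ = value of the dependent variable (here, sales)
a = y-axis intercept
b = slope of the regression line
x = the independent variable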
EXAMPLE 8
A company builds offices in the Detroit area. Over time, the company has
found that the dollar volume of its renovation work depends on the
Detroit area payroll. The following table lists the company's sales and the amount of money
earned by wage and salary workers in Detroit during the years 1988-1993.
Sales, y                      Area payroll, x
($ hundreds of thousands)     ($ hundreds of millions)
2.0                           1
3.0                           3
2.5                           4
2.0                           2
2.0                           1
3.5                           7
The company's management wants to establish a mathematical relationship that will help
predict sales. First, they need to determine whether there is a straight-line (linear) relationship
between area payroll and sales by plotting the known data on a scatter diagram.
The six data points suggest a slight positive relationship
between the independent variable, payroll, and the dependent variable, sales. As
payroll increases, the company's sales tend to be higher.
A mathematical equation can be found by using the least-squares regression approach.
Sales, y    Payroll, x    x²    xy
2.0         1             1     2.0
3.0         3             9     9.0
2.5         4             16    10.0
2.0         2             4     4.0
2.0         1             1     2.0
3.5         7             49    24.5
Σy = 15.0   Σx = 18       Σx² = 80   Σxy = 51.5
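$$\bar{x} = \frac{\sum x}{6} = \frac{18}{6} = 3$$
$$\bar{y} = \frac{\sum y}{6} = \frac{15}{6} = 2.5$$
$$b = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{51.5 - (6)(3)(2.5)}{80 - (6)(3)^2} = \frac{6.5}{26} = 0.25$$
$$a = \bar{y} - b\bar{x} = 2.5 - (0.25)(3) = 1.75$$
The estimated regression equation is therefore:
$$\hat{y} = 1.75 + 0.25x$$
or, in words,
Sales = 1.75 + 0.25 (payroll)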
If the local chamber of commerce predicts that the payroll of the Detroit area will be
$600 million next year, the company's sales can be estimated
with the regression equation:
Sales (in hundreds of thousands of dollars) = 1.75 + 0.25 (6) = 1.75 + 1.50 = 3.25
Sales = $325,000
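The arithmetic of Example 8 can be checked with a short script. This is a minimal sketch, not part of the original example: the function name `fit_line` and the variable names are illustrative assumptions, and the data are the six observations listed above.

```python
# Data from Example 8: y = sales (hundreds of thousands of dollars),
# x = area payroll (hundreds of millions of dollars).
sales = [2.0, 3.0, 2.5, 2.0, 2.0, 3.5]
payroll = [1, 3, 4, 2, 1, 7]


def fit_line(x, y):
    """Return (a, b) for the least-squares line y_hat = a + b*x.

    fit_line is a hypothetical helper name used only for this sketch.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope and intercept from the least-squares formulas in the text.
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(payroll, sales)
forecast = a + b * 6  # a payroll of $600 million corresponds to x = 6
print(f"a = {a:.2f}, b = {b:.2f}, forecast = {forecast:.2f}")
# prints: a = 1.75, b = 0.25, forecast = 3.25
```

Multiplying the forecast by 100,000 converts it back to dollars, giving the $325,000 figure above.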
The final part of Example 8 illustrates a central weakness of causal methods such as
regression. Even when a regression equation has been computed, a forecast of the
independent variable x (in this case, payroll) must be provided before the dependent
variable y can be estimated for the next time period. Although this is not a problem for
every forecast, the difficulty of determining future values of some
common independent variables (such as unemployment rates, gross national product,
price indices, and so on) is easy to imagine.
The forecast of $325,000 for the company's sales in Example 8 is called a point
estimate of y. The point estimate is actually the mean, or expected value, of a distribution
of possible sales values. The following figure illustrates this concept.
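To measure the accuracy of the regression estimates, we compute the standard error of the
estimate, $S_{y,x}$. This is called the standard deviation of the regression. The equation
that follows is similar to the one found in most statistics
books for computing the standard deviation of an arithmetic mean:
$$S_{y,x} = \sqrt{\frac{\sum (y - y_c)^2}{n - 2}}$$
Where:
y = the y value of each data point
$y_c$ = the value of the dependent variable computed from the regression equation
n = the number of data points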
The following equation may look more complex, but it is actually an easier-to-use version
of the previous equation. Either formula yields the same result and can be
used when setting up prediction intervals around the point estimate.
$$S_{y,x} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}}$$
EXAMPLE 9
Calculate the standard error of the estimate for the company's data in Example 8.
The only number needed that is not yet available to solve for $S_{y,x}$ is $\sum y^2$. A quick
addition reveals that $\sum y^2 = 39.5$. Therefore:
$$S_{y,x} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} = \sqrt{\frac{39.5 - 1.75(15.0) - 0.25(51.5)}{6 - 2}} = \sqrt{0.09375} = 0.306 \text{ (in hundreds of thousands of dollars)}$$
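Both forms of the standard error formula can be evaluated on the Example 8 data to confirm that they agree. The sketch below assumes the coefficients a = 1.75 and b = 0.25 from Example 8; the variable names are illustrative only.

```python
import math

# Example 8 data and fitted coefficients (taken from the text).
sales = [2.0, 3.0, 2.5, 2.0, 2.0, 3.5]   # y, hundreds of thousands of dollars
payroll = [1, 3, 4, 2, 1, 7]             # x, hundreds of millions of dollars
a, b = 1.75, 0.25
n = len(sales)

# Form 1: squared deviations of each observed y from the computed value y_c.
squared_errors = sum((y - (a + b * x)) ** 2 for x, y in zip(payroll, sales))
s_from_deviations = math.sqrt(squared_errors / (n - 2))

# Form 2: the shortcut that needs only column sums.
sum_y = sum(sales)
sum_y2 = sum(y * y for y in sales)
sum_xy = sum(x * y for x, y in zip(payroll, sales))
s_from_sums = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

print(round(s_from_deviations, 3), round(s_from_sums, 3))
# prints: 0.306 0.306
```

In dollar terms, 0.306 hundred thousand is roughly $30,600.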
The regression equation is one way of expressing the nature of the relationship between two
variables. The equation shows how one variable relates to the value of, and
changes in, another variable.
Another way to evaluate the relationship between two variables is to compute the coefficient of
correlation. This measure expresses the degree or strength of the linear relationship. Usually
denoted r, the correlation coefficient can be any number between +1 and -1. The
following figure illustrates what different values of r might look like.
To compute r, we use much of the same data needed earlier to calculate
a and b for the regression line. The equation for r is:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \qquad (2.11)$$
EXAMPLE 10
In Example 8, we looked at the relationship between the construction company's
office-building sales and payroll in Detroit. Now, to compute the correlation coefficient for
the data shown, we need only add one more column of calculations (for y²) and
then apply the equation for r.
y      x    x²    xy     y²
2.0    1    1     2.0    4.0
3.0    3    9     9.0    9.0
2.5    4    16    10.0   6.25
2.0    2    4     4.0    4.0
2.0    1    1     2.0    4.0
3.5    7    49    24.5   12.25
Σy = 15.0   Σx = 18   Σx² = 80   Σxy = 51.5   Σy² = 39.5
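Applying equation (2.11) to these column sums can be sketched as follows. The variable names are illustrative assumptions; the data are the six observations from Example 8.

```python
import math

# Example 8 data: y = sales, x = area payroll.
sales = [2.0, 3.0, 2.5, 2.0, 2.0, 3.5]
payroll = [1, 3, 4, 2, 1, 7]
n = len(sales)

# Column sums, matching the totals row of the table.
sum_x = sum(payroll)                                  # 18
sum_y = sum(sales)                                    # 15.0
sum_x2 = sum(x * x for x in payroll)                  # 80
sum_y2 = sum(y * y for y in sales)                    # 39.5
sum_xy = sum(x * y for x, y in zip(payroll, sales))   # 51.5

# Equation (2.11): correlation coefficient r.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))
# prints: 0.901
```

The numerator is 39 and the denominator is the square root of 1,872, which gives the r of 0.901 used in the discussion of the coefficient of determination.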
Although the correlation coefficient is the measure most commonly used to describe the relationship between
two variables, another measure exists. It is called the coefficient of determination, and it is simply
the square of the correlation coefficient, namely $r^2$. The value of $r^2$ is never negative and
always lies in the range $0 \le r^2 \le 1$. The coefficient of determination is the percentage of
variation in the dependent variable (y) that is explained by the regression equation. In the
company's case, the value of $r^2$ is 0.81 ($0.901^2 = 0.811$), indicating that 81% of the
total variation is explained by the regression equation.
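The "explained variation" reading of the coefficient of determination can be illustrated directly. The sketch below is not taken from the original example. It assumes the Example 8 coefficients a = 1.75 and b = 0.25 and computes r² as one minus the ratio of unexplained variation to total variation, an identity that holds for a least-squares line with an intercept.

```python
# Example 8 data and fitted coefficients (taken from the text).
sales = [2.0, 3.0, 2.5, 2.0, 2.0, 3.5]
payroll = [1, 3, 4, 2, 1, 7]
a, b = 1.75, 0.25

y_bar = sum(sales) / len(sales)

# Total variation of y around its mean.
total_variation = sum((y - y_bar) ** 2 for y in sales)

# Variation left unexplained by the regression line.
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(payroll, sales))

# Share of the total variation explained by the regression equation.
r_squared = 1 - unexplained / total_variation

print(total_variation, unexplained, r_squared)
# prints: 2.0 0.375 0.8125
```

The unrounded value is 0.8125. Squaring the rounded r of 0.901 gives about 0.81, and both round to 81%.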
Multiple regression is a practical extension of the model just discussed. It allows us to
build a model with several independent variables. For example, if the construction
company wanted to include average annual interest rates in its model
for forecasting renovation sales, the appropriate equation would be:
$$\hat{y} = a + b_1 x_1 + b_2 x_2$$
Where:
$\hat{y}$ = dependent variable, sales
a = y-intercept
$x_1$ and $x_2$ = values of the two independent variables, area payroll and interest rates,
respectively.
The mathematics of multiple regression becomes quite complex (and is usually left
to the computer), so the formulas for a, b1, and b2 are left to statistics textbooks.
EXAMPLE 11
The new multiple regression line for the construction company, calculated by
computer software, is:
$$\hat{y} = 1.80 + 0.30x_1 - 5.0x_2$$
We also find that the new correlation coefficient is 0.96, implying that the inclusion
of the variable $x_2$, interest rates, adds even more strength to the linear relationship.
We can now estimate the company's sales if we substitute values for next year's payroll
and interest rate. If Detroit's payroll will be $600 million and the
interest rate will be 0.12 (12%), sales will be forecast as:
Sales ($ hundreds of thousands) = 1.80 + 0.30 (6) - 5.0 (0.12) = 1.8 + 1.8 - 0.6 = 3.00
Sales = $300,000
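Once the coefficients are known, evaluating the multiple regression forecast is a one-line calculation. The sketch below takes the coefficients 1.80, 0.30, and -5.0 from Example 11 as given; the function name `forecast_sales` is a hypothetical helper, and the code does not show how the coefficients themselves were fitted.

```python
def forecast_sales(payroll, interest_rate):
    """Forecast sales in hundreds of thousands of dollars.

    payroll is in hundreds of millions of dollars and interest_rate is a
    decimal fraction (0.12 for 12%). Coefficients are those of Example 11.
    """
    return 1.80 + 0.30 * payroll - 5.0 * interest_rate


forecast = forecast_sales(payroll=6, interest_rate=0.12)
print(round(forecast, 2), round(forecast * 100_000))
# prints: 3.0 300000
```

Because the interest-rate coefficient is negative, the forecast falls as the assumed rate rises with payroll held fixed.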