Forecasting: Components of Time Series Analysis
COMPONENTS OF TIME SERIES ANALYSIS
Depending on the kind of variability, the components of a time series can be grouped into four categories: Secular Trend Component, Cyclical Component, Seasonal Component and Random / Irregular Component. Each component has its own specific techniques of calculation.
A. Secular Trend Component
The tendency of time series data to increase, decrease or stagnate over a long period of time is called the Secular Trend. The trend that emerges over time is usually the result of long-term factors that affect the dependent variable. Secular Trend can be divided into two broad groups: Linear Trend and Non-Linear or Curvilinear Trend. In the former case the function's highest power is unity, and in the latter case the function's highest power is other than unity. Depending on the nature of the trend function used, the long-term secular trend can be measured by fitting a linear trend, an exponential trend or a parabolic trend. Some of the methods of fitting the trend in a time series are discussed below:
1. Free Hand Curve Method
In this case the trend is represented by a straight line, i.e. by the equation:
Y = a + bX
The following steps are used to estimate the values of a (vertical intercept) and b (slope of the line):
The value of a is the distance on the vertical axis from where the straight line originates.
For the value of b, find the difference between the values of the first and last time periods and then divide this difference by the number of time periods involved.
Illustration: Fit a free hand curve to the following data:
Year    Yt
1998    10
1999    12
2000    15
2001    19
2002    20
2003    17
2004    14
2005    19
Solution:
The value of a is 10 (the Yt for the first year).
The value of b is (19 − 10) / 8 = 1.125
Hence, the required linear equation is: Y = 10 + 1.125X
Using this equation we can calculate the theoretical values of Y for the various years (1st year = 1, 2nd year = 2, 3rd year = 3, ...) and obtain the trend line (refer Fig 1).
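The calculation above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in these notes (first value as intercept, first-to-last difference divided by the number of periods); the variable names are chosen here and are not part of the original.

```python
# Free hand (first-to-last) straight line for the illustration data.
years = [1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005]
y = [10, 12, 15, 19, 20, 17, 14, 19]

a = y[0]                       # intercept: value for the first year
b = (y[-1] - y[0]) / len(y)    # (19 - 10) / 8, dividing by the number of periods

# Theoretical (trend) values with 1st year = 1, 2nd year = 2, ...
trend = [a + b * x for x in range(1, len(y) + 1)]

for year, observed, fitted in zip(years, y, trend):
    print(year, observed, round(fitted, 3))
```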
Figure 1
2. Method of Semi-Averages
In this method, the total series of observations are subdivided into two parts. The average of each part is computed and placed against the middle period. Taking the two average points a curve is fitted, known as the semi-average curve. Further assuming that a linear function would adequately describe the data, a trend line can now be fitted (based on the function Y = a + bX). The constant components of the function (viz., a and b) are estimated as under:
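The average of the first part is taken as the intercept (i.e., a).
The slope is determined with the help of the formula:
b = (average of the second part − average of the first part) / (number of time periods between the mid-points of the two parts)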
Illustration: Fit a trend curve to the following data by the method of semi-averages (the data is given in columns 1 and 2, and the semi-averages are worked out and displayed in column 3):
Year    Annual Income (Rs lakhs)    Semi-Averages
1980    1
1981    1.5
1982    1.9
1983    1.9                         1.76
1984    1.95
1985    2
1986    2.1
1987    2.22
1988    2.31
1989    2.42
1990    2.5
1991    2.55                        2.61
1992    2.1
1993    3.11
1994    3.25
1987 is taken as the year that divides the data into two equal halves (of 7 years each). Hence, the vertical intercept (a) would be 1.76 and the slope (b) would be estimated as:
b = (2.61 − 1.76) / 8 = 0.106
The trend line is therefore: Y = 1.76 + 0.106X
Note: If the data has an even number of observations, the average of each half is placed midway between the two mid-most years of that half. For instance, if the above data ran from 1980 to 1991 (12 years in all), the mean of the first half would be placed at 1982.5 and that of the second half at 1988.5.
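A minimal Python sketch of the semi-average calculation follows. It assumes the convention used in the illustration (the middle year is dropped when the number of observations is odd); the variable names are illustrative. Because it works with unrounded averages, the slope comes out at about 0.105 rather than the 0.106 obtained above from the rounded semi-averages.

```python
# Semi-averages for the annual income illustration (1980-1994).
years = list(range(1980, 1995))
income = [1, 1.5, 1.9, 1.9, 1.95, 2, 2.1, 2.22,
          2.31, 2.42, 2.5, 2.55, 2.1, 3.11, 3.25]

half = len(income) // 2                 # 7 observations per half; 1987 is left out
avg_first = sum(income[:half]) / half
avg_second = sum(income[-half:]) / half

# Mid-points of the two halves (1983 and 1991 here; x.5 values for even halves)
mid_first = sum(years[:half]) / half
mid_second = sum(years[-half:]) / half

a = avg_first
b = (avg_second - avg_first) / (mid_second - mid_first)

print(round(avg_first, 2), round(avg_second, 2), round(b, 3))
```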
3.
This method has the following sub-methods: Method of Moving Averages; Method of Weighted Moving Averages; Method of Semi-Averages (which is the same as discussed above).
a. Method of Moving Averages: A moving average can be obtained by successively averaging overlapping groups of two or more consecutive values in a time series.
Illustration: Find the four-quarterly moving average of the following data:
Year    I     II    III   IV
2004    42    58    80    60
2005    46    60    82    64
2006    44    56    85    70
2007    48    54    89    72
Solution:
4-Qtly M.A. (each value falls midway between the 2nd and 3rd quarters of its group):
(42+58+80+60)/4 = 60
(58+80+60+46)/4 = 61
(80+60+46+60)/4 = 61.5
(60+46+60+82)/4 = 62
(46+60+82+64)/4 = 63
(60+82+64+44)/4 = 62.5
(82+64+44+56)/4 = 61.5
(64+44+56+85)/4 = 62.25
(44+56+85+70)/4 = 63.75
(56+85+70+48)/4 = 64.75
(85+70+48+54)/4 = 64.25
(70+48+54+89)/4 = 65.25
(48+54+89+72)/4 = 65.75
Year    Quarter    Sales    Centred 4-Qtly M.A.
2004    I          42       -
2004    II         58       -
2004    III        80       (60+61)/2 = 60.5
2004    IV         60       (61+61.5)/2 = 61.25
2005    I          46       (61.5+62)/2 = 61.75
2005    II         60       (62+63)/2 = 62.5
2005    III        82       (63+62.5)/2 = 62.75
2005    IV         64       (62.5+61.5)/2 = 62
2006    I          44       (61.5+62.25)/2 = 61.875
2006    II         56       (62.25+63.75)/2 = 63
2006    III        85       (63.75+64.75)/2 = 64.25
2006    IV         70       (64.75+64.25)/2 = 64.5
2007    I          48       (64.25+65.25)/2 = 64.75
2007    II         54       (65.25+65.75)/2 = 65.5
2007    III        89       -
2007    IV         72       -
Note: When an average of even order needs to be computed, each average should be placed at the centre of the selected data set. However, in this case the moving average does not correspond to a particular time period, making further analysis difficult. Hence, another set of centred moving averages is calculated by taking the average of two successive moving averages and placing the value in between them. Such centring is not needed when the order of the moving average is odd.
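The moving-average and centring steps can be sketched as follows. The function names are illustrative only; the data are the quarterly sales from the illustration.

```python
def moving_average(values, k):
    """Simple k-period moving averages over overlapping windows."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


def centred_moving_average(values, k):
    """For even k: average each pair of successive k-period moving averages."""
    ma = moving_average(values, k)
    return [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]


# Quarterly sales, 2004 Q1 to 2007 Q4
sales = [42, 58, 80, 60, 46, 60, 82, 64, 44, 56, 85, 70, 48, 54, 89, 72]

ma4 = moving_average(sales, 4)           # 13 values, first is 60.0
cma4 = centred_moving_average(sales, 4)  # 12 values, aligned with 2004 Q3 .. 2007 Q2

print(ma4)
print(cma4)
```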
b. Method of Weighted Moving Averages: The simple moving average assigns equal weightage to all values. It is also possible to assign different weights to different values (if the data needs such an adjustment). The weighted moving average (WMA) is calculated by using the following formula:
WMA = Σ(weight × value) / Σ(weights)
Illustration: The manager of a retail store wants to forecast sales of a particular brand of soap, which was recently introduced by a company. Sales data for the last 12 weeks is available to the manager:
Week    Sales
1       2
2       3
3       4
4       3
5       4
6       6
7       8
8       10
9       12
10      11
11      13
12      14
The manager decides to assign the following weights to calculate the 3-weekly weighted average.
Weeks Weights
Last Week 4
2 weeks ago 3
3 weeks ago 2
Total 9
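A short Python sketch of the 3-weekly weighted moving average with these weights is given below. It assumes, as one reasonable reading of the illustration, that the weighted average of the three most recent weeks serves as the forecast for the following week; the names are illustrative.

```python
# Weekly sales for weeks 1..12
sales = [2, 3, 4, 3, 4, 6, 8, 10, 12, 11, 13, 14]

# Weights ordered oldest to newest: 3 weeks ago, 2 weeks ago, last week
weights = [2, 3, 4]


def weighted_moving_average(window, weights):
    """Sum of weight x value divided by the sum of the weights."""
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)


# Forecast for week t uses weeks t-3, t-2 and t-1 (assumption stated above)
forecasts = {
    week: weighted_moving_average(sales[week - 4:week - 1], weights)
    for week in range(4, 14)
}

for week, value in forecasts.items():
    print(week, round(value, 2))
```

For week 13 this gives (2 × 11 + 3 × 13 + 4 × 14) / 9 = 13.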
4. Method of Least Squares (Straight-Line Trend)
A straight-line trend is appropriate when a time series grows by a relatively constant amount per period. To fit the straight-line trend we apply the least squares method to the regression equation Y = a + bX (Note: X denotes the time variable and Y the observations at different points of time).
The two normal equations are:
ΣY = na + bΣX ......(i)
ΣXY = aΣX + bΣX² ......(ii)
By solving the above two equations simultaneously, we obtain a and b and thereby get the trend line equation.
Illustration: Find the straight-line trend for the following data and then estimate the profit for the year 2008. Also draw the trend line.
Year                  2000   2001   2002   2003   2004   2005   2006
Profit (Rs '00000)    30     35     40     42     45     48     50
Solution:
Year      Profit (Y)    X = Year − 2003    X²    XY     Trend value
2000      30            -3                 9     -90    32
2001      35            -2                 4     -70    35
2002      40            -1                 1     -40    38
2003      42            0                  0     0      41
2004      45            1                  1     45     44
2005      48            2                  4     96     48
2006      50            3                  9     150    51
Totals    290           0                  28    91
Since ΣX = 0, the normal equations reduce to a = ΣY / n = 290 / 7 = 41.43 and b = ΣXY / ΣX² = 91 / 28 = 3.25
Therefore the trend line equation is: Y = 41.43 + 3.25X
We can use this equation to find the trend (theoretical) values of Y.
The estimated profit in 2008 will be 41.43 + 3.25(5) = 41.43 + 16.25 = 57.68
The trend line is plotted in the following graph.
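The least squares fit for this illustration can be sketched as below. The shortcut formulas used here hold only because the time origin (2003) is chosen so that ΣX = 0; the function and variable names are illustrative.

```python
# Least squares straight-line trend, time origin at the middle year.
years = list(range(2000, 2007))
profit = [30, 35, 40, 42, 45, 48, 50]

origin = 2003
x = [year - origin for year in years]       # -3 .. 3, so the sum of X is 0
n = len(x)

# With sum(X) == 0 the normal equations reduce to these two ratios
a = sum(profit) / n
b = sum(xi * yi for xi, yi in zip(x, profit)) / sum(xi * xi for xi in x)


def trend(year):
    """Trend (theoretical) value for a given calendar year."""
    return a + b * (year - origin)


print(round(a, 2), round(b, 2), round(trend(2008), 2))
```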
5. Exponential Trend
Exponential trend is applicable where the time series grows at a nearly constant rate per unit of time. The exponential curve is given by the equation: Yt = ab^X
This exponential equation can be transformed into a linear equation by taking the logarithms of both sides: log Yt = log a + X log b
The normal equations for an exponential expression are:
Σ log Y = n log a + log b ΣX ......(i)
Σ X log Y = log a ΣX + log b ΣX² ......(ii)
Solving these equations simultaneously, with the time origin chosen so that ΣX = 0, we get:
log a = Σ log Y / n and log b = Σ X log Y / ΣX²
Illustration: The index of industrial production in India from 1975 to 1985 is presented below. Find the exponential trend equation and the estimated index for 1988.
Year     1975   1976   1977   1978   1979   1980   1981   1982   1983   1984   1985
Index    100    115    130    137    135    130    140    148    155    162    180
Solution:
Year      Index (Y)    X     Log Y      X (Log Y)    X²
1975      100          -5    2.0000     -10.0000     25
1976      115          -4    2.0607     -8.2428      16
1977      130          -3    2.1139     -6.3417      9
1978      137          -2    2.1367     -4.2734      4
1979      135          -1    2.1303     -2.1303      1
1980      130          0     2.1139     0            0
1981      140          1     2.1461     2.1461       1
1982      148          2     2.1703     4.3406       4
1983      155          3     2.1903     6.5709       9
1984      162          4     2.2095     8.8380       16
1985      180          5     2.2553     11.2765      25
Totals                 0     23.5270    2.1839       110
log a = 23.5270 / 11 = 2.14 and log b = 2.1839 / 110 = 0.02
Therefore the exponential trend is: log Yt = 2.14 + 0.02X
The estimated index for 1988 (X = 8) is: log Yt = 2.14 + 0.02(8) = 2.14 + 0.16 = 2.3
Therefore Yt = antilog 2.3 = 199.5
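A minimal Python sketch of the same log-linear fit is shown below, assuming base-10 logarithms and the time origin at 1980 so that ΣX = 0. Since it keeps unrounded coefficients, the 1988 estimate comes out near 198.5 rather than the 199.5 obtained above with log a and log b rounded to two decimals. The names are illustrative.

```python
import math

# Index of industrial production, 1975-1985
years = list(range(1975, 1986))
index = [100, 115, 130, 137, 135, 130, 140, 148, 155, 162, 180]

origin = 1980
x = [year - origin for year in years]        # -5 .. 5, sums to zero
log_y = [math.log10(v) for v in index]
n = len(x)

# Normal equations with sum(X) == 0
log_a = sum(log_y) / n
log_b = sum(xi * li for xi, li in zip(x, log_y)) / sum(xi * xi for xi in x)


def estimate(year):
    """Exponential trend value Y = a * b**X, evaluated via logarithms."""
    return 10 ** (log_a + log_b * (year - origin))


print(round(log_a, 4), round(log_b, 4), round(estimate(1988), 1))
```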
B. Seasonal Component
Seasonal variation is that movement of a time series where the change occurs within a period of one year or less. Detecting and measuring the seasonal variation of time series data can be useful in the following ways:
i. It can help in analyzing the behaviour of the series in the past.
ii. It can be used for making projections for the future, based on examination of past patterns.
iii. Once the seasonal variation has been calculated, we can eliminate it from the time series and determine the cyclical patterns in the data. This is known as deseasonalization.
Some of the methods of measuring the seasonal component of time series data are: Method of Simple Averages, Ratio-to-Trend Method (or Percentage-to-Trend Method) and Ratio-to-Moving Average Method.
1. Method of Simple Averages
The following steps have to be followed in this method:
The total for each period (e.g. month) is computed.
The average for each period (month) is calculated.
The average of the periodic averages is calculated.
The seasonal indices for each month are calculated with the help of the formula:
Seasonal index = (monthly average / average of monthly averages) × 100
Illustration: Calculate the seasonal indices for the following data, using the method of simple averages:
Month    2004   2005   2006   2007   Mthly Totals   Mthly Averages   Seasonal Indices
Jan      364    394    399    347    1504           376              124.00
Feb      342    367    379    325    1413           353.25           116.50
March    288    345    328    302    1263           315.75           104.13
April    262    309    360    270    1201           300.25           99.02
May      236    284    300    247    1067           266.75           87.97
June     245    279    308    230    1062           265.5            87.56
July     249    251    270    220    990            247.5            81.62
Aug      268    269    310    230    1077           269.25           88.79
Sept     250    287    300    250    1087           271.75           89.62
Oct      300    320    340    279    1239           309.75           102.15
Nov      328    328    350    280    1286           321.5            106.03
Dec      299    367    390    310    1366           341.5            112.62
Sum of monthly averages = 3638.8
Average of monthly averages = 3638.8 / 12 = 303.23
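The simple-averages calculation can be sketched in Python as follows. The data layout (a dictionary of yearly lists) and the names are illustrative choices, not part of the original.

```python
# Monthly figures (Jan..Dec) for each year of the illustration
data = {
    2004: [364, 342, 288, 262, 236, 245, 249, 268, 250, 300, 328, 299],
    2005: [394, 367, 345, 309, 284, 279, 251, 269, 287, 320, 328, 367],
    2006: [399, 379, 328, 360, 300, 308, 270, 310, 300, 340, 350, 390],
    2007: [347, 325, 302, 270, 247, 230, 220, 230, 250, 279, 280, 310],
}

n_years = len(data)

# Steps 1-2: total and average for each month across the years
monthly_totals = [sum(data[year][m] for year in data) for m in range(12)]
monthly_averages = [total / n_years for total in monthly_totals]

# Step 3: average of the monthly averages
grand_average = sum(monthly_averages) / 12

# Step 4: seasonal index = monthly average / grand average x 100
seasonal_indices = [100 * avg / grand_average for avg in monthly_averages]

print([round(s, 2) for s in seasonal_indices])
```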
2. Ratio-to-Trend Method
From the following data (columns 1, 2 & 3) do as directed:
i. Estimate the trend equation. [Note: this is nothing but getting the trend values (column 7) using the least squares method (through the normal equations).]
ii. Compute the quarterly seasonal indices using the ratio-to-trend method.
iii. Explain how an analyst can use these indices to set quarterly target schedules.
Year      Quarter (X)   Sales (Y)   X = Qtr − 6.5   X²       XY        Y trend   % of Trend values
2004      I (1)         118         -5.5            30.25    -649      128.8     91.61
2004      II (2)        109         -4.5            20.25    -490.5    132.4     82.33
2004      III (3)       93          -3.5            12.25    -325.5    136       68.38
2004      IV (4)        120         -2.5            6.25     -300      139.6     85.96
2005      I (5)         126         -1.5            2.25     -189      143.2     87.99
2005      II (6)        124         -0.5            0.25     -62       146.8     84.47
2005      III (7)       108         0.5             0.25     54        150.4     71.81
2005      IV (8)        139         1.5             2.25     208.5     154       90.26
2006      I (9)         143         2.5             6.25     357.5     157.6     90.74
2006      II (10)       140         3.5             12.25    490       161.2     86.85
2006      III (11)      127         4.5             20.25    571.5     164.8     77.06
2006      IV (12)       155         5.5             30.25    852.5     168.4     92.04
Totals                  1502        0               143      518
a = ΣY / n = 1502 / 12 = 125.2 and b = ΣXY / ΣX² = 518 / 143 = 3.6
Steps to be followed in the ratio-to-trend method:
1. After estimating the trend equation (through the least squares method), we obtain the trend values for each time unit.
2. Calculate each observed value in the series (column 3) for each time period as a percentage of the corresponding trend value (column 7) [Note: % of trend = (Y / Y trend) × 100].
3. This step ensures that the trend (secular) values have been eliminated from the time series.
4. In the next step, we determine if there is a seasonal effect in the time series. For this we need to examine Column 8. There are two indicators of the presence of seasonal effects:
a. The ratios for the same period are similar from year to year (e.g. the ratio for July of one year is similar to the ratio for July of other years); this indicates that there is a seasonal effect in the data.
b. The ratios differ from one period to another. If the ratios are the same for all periods, there is no seasonal effect in the data.
5. Calculate the median or modified mean for each period (Note: the modified mean is obtained by discarding the highest and the lowest values of a given period and averaging the remaining data).
Quarter    2004     2005     2006     Median (Seasonal Index)
I          91.6     87.99    90.74    90.74
II         82.33    84.47    86.85    84.47
III        68.38    71.81    77.06    71.81
IV         85.96    90.26    92.04    90.26
(Note: i. Each Quarter's cell value is from Column 8 of the first table of this method. ii. The Median value is the mid-most value of each row.)
6. The final step involves adjusting the seasonal indices in such a manner that their average is 100. This is done by using the formula:
Adjusted seasonal index = seasonal index × (400 / sum of the four seasonal indices)
Here the sum of the medians is 337.28, so the adjustment factor is 400 / 337.28 = 1.19.
The final adjusted seasonal indices are as follows:
Quarter    Adjusted Seasonal Indices
I          (90.74)(1.19) = 107.98
II         (84.47)(1.19) = 100.52
III        (71.81)(1.19) = 85.45
IV         (90.26)(1.19) = 107.41
Implications:
The 1st and 4th quarters show a positive seasonal variation.
The 2nd quarter shows a very marginal positive effect.
The 3rd quarter shows a negative seasonal effect.
Based on these observations the executive decisions taken can be:
o increase targets during the 1st and 4th quarters;
o keep the target nearly the same in the 2nd quarter; and
o have a lower target during the 3rd quarter.
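The six steps can be sketched in Python as below. This is a sketch under stated assumptions: it fits the trend with the centred time variable X = quarter number − 6.5 throughout, whereas the tabulated "Y trend" column corresponds to 125.2 + 3.6 × (quarter number 1 to 12), so the intermediate percentages differ from the table even though the adjusted indices come out close. All names are illustrative.

```python
from statistics import median

# Quarterly sales, 2004 Q1 to 2006 Q4
sales = [118, 109, 93, 120, 126, 124, 108, 139, 143, 140, 127, 155]
n = len(sales)

# Step 1: least squares trend with centred X (sum of X is zero)
x = [q - (n + 1) / 2 for q in range(1, n + 1)]
a = sum(sales) / n
b = sum(xi * yi for xi, yi in zip(x, sales)) / sum(xi * xi for xi in x)
trend = [a + b * xi for xi in x]

# Step 2: observed value as a percentage of the trend value
pct_of_trend = [100 * y / t for y, t in zip(sales, trend)]

# Step 5: median percentage for each quarter across the years
medians = [median(pct_of_trend[q::4]) for q in range(4)]

# Step 6: rescale so that the four indices average 100
factor = 400 / sum(medians)
adjusted = [m * factor for m in medians]

for name, value in zip(["I", "II", "III", "IV"], adjusted):
    print(name, round(value, 2))
```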
3. Ratio-to-Moving Average Method
This method is similar to the ratio-to-trend method, the only difference being that moving averages are used in place of the trend values, and the observed values are calculated as a percentage of the corresponding moving averages.
From the following data do as directed:
Year    I     II    III   IV
2004    42    58    80    60
2005    46    60    82    64
2006    44    56    85    70
2007    48    54    89    72
ii. Compute the quarterly seasonal indices using the ratio-to-moving average method.
iii. Explain how an analyst can use these indices to set quarterly target schedules.
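Solution:
(Solution for Q.i is in the following table)
4-Qtly M.A. (each value falls midway between the 2nd and 3rd quarters of its group):
(42+58+80+60)/4 = 60
(58+80+60+46)/4 = 61
(80+60+46+60)/4 = 61.5
(60+46+60+82)/4 = 62
(46+60+82+64)/4 = 63
(60+82+64+44)/4 = 62.5
(82+64+44+56)/4 = 61.5
(64+44+56+85)/4 = 62.25
(44+56+85+70)/4 = 63.75
(56+85+70+48)/4 = 64.75
(85+70+48+54)/4 = 64.25
(70+48+54+89)/4 = 65.25
(48+54+89+72)/4 = 65.75
Year    Quarter    Sales    Centered 4-Qtly M.A.
2004    I          42       -
2004    II         58       -
2004    III        80       (60+61)/2 = 60.5
2004    IV         60       (61+61.5)/2 = 61.25
2005    I          46       (61.5+62)/2 = 61.75
2005    II         60       (62+63)/2 = 62.5
2005    III        82       (63+62.5)/2 = 62.75
2005    IV         64       (62.5+61.5)/2 = 62
2006    I          44       (61.5+62.25)/2 = 61.875
2006    II         56       (62.25+63.75)/2 = 63
2006    III        85       (63.75+64.75)/2 = 64.25
2006    IV         70       (64.75+64.25)/2 = 64.5
2007    I          48       (64.25+65.25)/2 = 64.75
2007    II         54       (65.25+65.75)/2 = 65.5
2007    III        89       -
2007    IV         72       -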
[Table: Year; Quarter; X = Qtr − 6.5; X²; XY; Y trend; % of Trend values. Totals shown: 754.625, 24, 191, 1566.8]
Steps to be followed in the ratio-to-moving average method:
1. After estimating the trend equation (through the least squares method), we obtain the trend values for each time unit.
2. Calculate each observed value in the series (column 3) for each time period as a percentage of the corresponding trend value (column 7) [Note: % of trend = (observed value / trend value) × 100].
3. This step ensures that the trend (secular) values have been eliminated from the time series.
4. In the next step, we determine if there is a seasonal effect in the time series. For this we need to examine Column 8. There are two indicators of the presence of seasonal effects:
a. The ratios for the same period are similar from year to year (e.g. the ratio for July of one year is similar to the ratio for July of other years); this indicates that there is a seasonal effect in the data.
b. The ratios differ from one period to another. If the ratios are the same for all periods, there is no seasonal effect in the data.
5. Calculate the median or modified mean for each period (Note: the modified mean is obtained by discarding the highest and the lowest values of a given period and averaging the remaining data).
(Solution for Q.iii is in the following table)
[Table: median of the quarterly percentages (Seasonal Index)]
(Note: i. Each Quarter's cell value is from Column 8 of the first table of this method. ii. The Median value is the mid-most value of each row.)
6. The final step involves adjusting the seasonal indices in such a manner that their average is 100. This is done by using the formula:
Adjusted seasonal index = seasonal index × (400 / sum of the four seasonal indices)
The final adjusted seasonal indices are as follows:
Quarter    Adjusted Seasonal Indices
I          (51.14)(1.86) = 95.12
II         (48.76)(1.86) = 90.69
III        (59.99)(1.86) = 111.58
IV         (54.96)(1.86) = 102.23
Implications:
The 3rd quarter shows a pronounced positive seasonal variation.
The 2nd quarter shows a negative effect.
The 4th quarter shows a marginal positive seasonal effect, while the 1st quarter is somewhat below average.
Based on these observations the executive decisions taken can be:
o increase targets during the 3rd quarter;
o progressively reduce targets in the 1st and 4th quarters; and
o have a lower target during the 2nd quarter.
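As a cross-check, the ratio-to-moving average idea described at the start of this method (observed values as a percentage of the centred moving averages) can be sketched directly in Python. This is an assumption about how the method is usually carried out, not a reconstruction of the worked table above, and its figures differ from the adjusted indices listed there. All names are illustrative.

```python
from statistics import median

# Quarterly sales, 2004 Q1 to 2007 Q4
sales = [42, 58, 80, 60, 46, 60, 82, 64, 44, 56, 85, 70, 48, 54, 89, 72]
k = 4

# 4-quarterly moving averages, then centred by averaging successive pairs
ma = [sum(sales[i:i + k]) / k for i in range(len(sales) - k + 1)]
cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]

# cma[i] lines up with sales[i + k // 2]; group the percentages by quarter
ratios = {q: [] for q in range(4)}
for i, centre in enumerate(cma):
    pos = i + k // 2
    ratios[pos % 4].append(100 * sales[pos] / centre)

# Median per quarter, then rescale so that the indices average 100
medians = [median(ratios[q]) for q in range(4)]
factor = 400 / sum(medians)
adjusted = [m * factor for m in medians]

for name, value in zip(["I", "II", "III", "IV"], adjusted):
    print(name, round(value, 2))
```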
C. Cyclical Component
Residual Method
The following data shows the number of projects a construction company had lined up over 12 years. Calculate the cyclical indices by the Residual Method.
Year      No. of projects (Y)   X = Yr − 2000.5   X²       XY       Y trend   % of Trend values
1995      15                    -5.5              30.25    -82.5    16.79     89.34
1996      19                    -4.5              20.25    -85.5    17.81     106.68
1997      22                    -3.5              12.25    -77      18.83     116.83
1998      21                    -2.5              6.25     -52.5    19.85     105.79
1999      19                    -1.5              2.25     -28.5    20.87     91.04
2000      20                    -0.5              0.25     -10      21.89     91.37
2001      21                    0.5               0.25     10.5     22.91     91.66
2002      23                    1.5               2.25     34.5     23.93     96.11
2003      28                    2.5               6.25     70       24.95     112.22
2004      27                    3.5               12.25    94.5     25.97     103.97
2005      25                    4.5               20.25    112.5    26.99     92.63
2006      29                    5.5               30.25    159.5    28.01     103.53
Totals    269                   0                 143      145.5
Steps to be followed in the residual method:
1. After estimating the trend equation (through the least squares method), we obtain the trend values for each time unit.
2. Calculate each observed value in the series (column 2) for each time period as a percentage of the corresponding trend value (column 6).
3. This step ensures that the trend (secular) values have been eliminated from the time series. With annual data, the percentages that remain (column 7) serve as the cyclical indices.
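The residual-method steps for this illustration can be sketched as follows. The sketch keeps unrounded trend coefficients, so its percentages differ slightly (in the first decimal) from the table, which appears to use a = 22.4 and b = 1.02. The names are illustrative.

```python
# Number of projects per year, 1995-2006
years = list(range(1995, 2007))
projects = [15, 19, 22, 21, 19, 20, 21, 23, 28, 27, 25, 29]
n = len(projects)

# Step 1: least squares trend with the origin at the mid-point (2000.5)
mid = sum(years) / n
x = [year - mid for year in years]
a = sum(projects) / n
b = sum(xi * yi for xi, yi in zip(x, projects)) / sum(xi * xi for xi in x)
trend = [a + b * xi for xi in x]

# Steps 2-3: observed value as a percentage of trend, i.e. the cyclical index
cyclical = [100 * y / t for y, t in zip(projects, trend)]

for year, value in zip(years, cyclical):
    print(year, round(value, 2))
```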
D. Random / Irregular Component
Since irregular variations are completely random in nature, they are difficult to model and analyze. The usual approach is to treat them as the leftover component once the trend, seasonal and cyclical components have been accounted for.
predictions so as to prepare and adapt their projects. For that purpose they use, or rely on experts who use, the best available data and what are considered the most objective methods and models, often relying on statistics and mathematics, to process them. But there is a risk that those analyses and forecasts are affected by overconfidence in numbers (numeracy bias) and in the underlying rational assumptions. In fact, several flaws and obstacles, described below, might affect the data as well as the models and methods used to make projections.
2. Biased human and social reactions
Consumers, producers, investors, borrowers, lenders, businesses and public institutions might react in unforeseen ways and be affected by behavioural biases. Faulty policy implications and recommendations can arise from unrealistic assumptions. A model's foundational assumptions may give rise to a model that is not consistent with reality. Empirical validation is imperative if one is to have any confidence in the model's predictive ability.
3. Non-binary situations
Hard-core quantification might be inappropriate when the reasoning is about people and society, or about complex systems with unknown / unmeasurable / irrelevant probabilities.
4. Non-linear evolutions
Economic evolutions might be disrupted by percolations, bifurcations and other kinds of sudden jumps characteristic of dynamic systems. Such events disturb simple, linear extrapolation of past trends.
5. Overconfidence in historical probabilities and in mathematical laws
Numbers and equations have the appearance of rationality, but they can be illusory. Historical statistical series might be too short and therefore miss dramatic rare events. Again, stochastic laws based on assumptions (e.g., Gaussian assumptions) might not fully apply to the situation / relationship under consideration. Even more important, past data become irrelevant in entirely new circumstances and in situations that involve uncertainty and unmeasurable risks.
6. Herd instincts among experts
Experts tend to use similar equations, assumptions and data. This can bring about a superficial consensus and thus fail to reveal the true forces involved in the relationship. Again, models based on experts' projections tend to give a precise number (for example for the inflation rate, GDP growth rate, earnings per share, currency rate, stock index...). A forecast that gives only a single result ignores all other scenarios.
The above shortcomings of an economic model or forecast do not mean that economic models should be totally discarded. They only indicate that economic models should be flexible, built on scenarios that take into account various possible unforeseen circumstances.