Forecasting Techniques
Correlation Regression Time Series
"I have but one lamp by which my feet are guided, and that is the lamp of experience. I know of no way of judging the future but by the past." - Patrick Henry
Forecasting
Takes historical data and projects them into the future to predict the occurrence of uncertain events
Correlation
A statistical measure of the degree of association between two variables.
Examples:
- Family income and expenditure on luxury items
- Yield of a crop and quantity of fertilizer used
- Age and hours of TV viewing per day
- Advertising and sales
WHY?
- Is there a relationship between the two variables?
- Is the relationship strong?
- Can we use it for prediction?
TYPES
POSITIVE AND NEGATIVE
LINEAR AND NON LINEAR
SIMPLE,PARTIAL AND MULTIPLE
Positive Correlation
Increasing x, increasing y        Decreasing x, decreasing y
 x    y                            x    y
 5   10                           17   20
 8   12                           15   18
10   16                           10   16
15   18                            8   12
17   20                            5   10
Negative Correlation
Increasing x, decreasing y        Decreasing x, increasing y
 x    y                            x    y
 5   20                           17    2
 8   18                           15    7
10   16                           10    9
15   12                            8   13
17   10                            5   14
Linear Correlation
Variations in the values of two variables have a constant ratio
X    10    20    30    40    50
Y    70   140   210   280   350
Positive Linear Correlation
Plotted points fall on a straight line
[Figure: Y plotted against X; the points lie on a straight line]
Curvilinear Correlation
When the amount of change in one variable does not bear a constant ratio to the change in the other variable
Simple, Partial Correlation
Simple: two variables are studied, e.g. yield of a crop with respect to fertilizer.
Partial: two variables are studied while the effect of the other variables is held constant, e.g. the yield of a crop is affected by the amount of fertilizer, rainfall, quality of seed, etc., and the yield-fertilizer relationship is studied with the rest held constant.
Multiple Correlation
The relationship among three or more variables is studied simultaneously, e.g. yield of a crop, amount of rainfall and quality of seed.
Correlation Coefficient(r)
Correlation is a measure of linear association and not necessarily causation.
Just because two variables are highly correlated, it does not mean that one variable is the cause of the other.
Correlation Coefficient
The coefficient can take on values between -1 and +1.
Values near -1 indicate a strong negative linear relationship.
Values near +1 indicate a strong positive linear relationship.
Methods of Correlation Analysis
- Scatter Diagram
- Karl Pearson's Coefficient of Correlation
- Spearman's Rank Correlation Method
- Method of Least Squares
Scatter Diagram
High degree of positive correlation
[Figure: scatter diagram of profits against capital]
No Correlation
[Figure: scatter diagram showing no correlation]
Merits
Simple
Not influenced by extreme values
First step for investigating relationships
Limitations
Cannot know the exact degree of correlation
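Karl Pearson's Coefficient of Correlation (deviations from the mean)
$r = \dfrac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2}\,\sqrt{\sum (Y-\bar{Y})^2}} = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$, where $x = X-\bar{X}$ and $y = Y-\bar{Y}$
Karl Pearson's Coefficient of Correlation (direct method)
$r = \dfrac{N\sum XY - \sum X \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2}\,\sqrt{N\sum Y^2 - (\sum Y)^2}}$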
Make a scatter plot and find the correlation coefficient between sales and expenses from the data given.
Firm        1    2    3    4    5    6    7    8    9   10
Sales      50   50   55   60   65   65   65   60   60   50
Expenses   11   13   14   16   16   15   15   14   13   13
$\bar{X} = 580/10 = 58$; $\bar{Y} = 140/10 = 14$
Here x = X - 58 and y = Y - 14.
Firm    Sales X    x     x^2   Expenses Y    y    y^2    xy
1       50        -8     64    11           -3    9     +24
2       50        -8     64    13           -1    1      +8
3       55        -3      9    14            0    0       0
4       60        +2      4    16           +2    4      +4
5       65        +7     49    16           +2    4     +14
6       65        +7     49    15           +1    1      +7
7       65        +7     49    15           +1    1      +7
8       60        +2      4    14            0    0       0
9       60        +2      4    13           -1    1      -2
10      50        -8     64    13           -1    1      +8
Total   580        0    360    140           0   22      70
Calculation of r
$r = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$, where $x = X-\bar{X}$ and $y = Y-\bar{Y}$
$r = \dfrac{70}{\sqrt{360 \times 22}} = 0.787$
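The deviation method above can be checked with a few lines of Python. This is a minimal sketch; the function name `pearson_r` and the variable names are illustrative and not part of the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # x = X - mean of X
    dy = [y - mean_y for y in ys]          # y = Y - mean of Y
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / sqrt(sum_x2 * sum_y2)

# Sales and expenses of the ten firms in the worked example
sales = [50, 50, 55, 60, 65, 65, 65, 60, 60, 50]
expenses = [11, 13, 14, 16, 16, 15, 15, 14, 13, 13]
r_sales_expenses = pearson_r(sales, expenses)
print(round(r_sales_expenses, 3))
```

The printed value should agree with the hand calculation to three decimals.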
Properties
Lies between -1 and +1
Is independent of change of origin and scale
Is the geometric mean (GM) of the two regression coefficients: $r = \pm\sqrt{b_{xy}\, b_{yx}}$
If X and Y are independent variables, then r = 0
Q
Calculate the coefficient of correlation for the following data using Karl Pearson's method.
X    46   52   63   59   73   70   87   86
Y    39   41   52   68   60   64   76   80
r = +0.89
Correlation of Bivariate Grouped Data
$r = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - (\sum f d_x)^2}\,\sqrt{N\sum f d_y^2 - (\sum f d_y)^2}}$
Find the coefficient of correlation between the age and the sum assured from the following.
Age       Sum assured (in 000s)
          10    20    30    40    50
20-30      4     6     3     7     1
30-40      2     8    15     7     1
40-50      3     9    12     6     2
50-60      8     4     2     -     -
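$r = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - (\sum f d_x)^2}\,\sqrt{N\sum f d_y^2 - (\sum f d_y)^2}}$
$r = \dfrac{100(-7) - (-33)(-61)}{\sqrt{100(131) - (-33)^2}\,\sqrt{100(131) - (-61)^2}} = \dfrac{-2713}{109.59 \times 96.85} = -0.256$
They are negatively correlated.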
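The grouped-data calculation can be reproduced as a sketch in Python. The frequency grid below is read from the age / sum-assured table, with the two empty cells of the 50-60 row taken as zero; the assumed means (30 for sum assured, 45 for the age midpoint) and all names are illustrative assumptions that happen to reproduce the sums used in the solution.

```python
from math import sqrt

# Rows: age classes 20-30, 30-40, 40-50, 50-60
# Columns: sum assured 10, 20, 30, 40, 50 (in 000s); blanks assumed to be 0
freq = [
    [4, 6, 3, 7, 1],
    [2, 8, 15, 7, 1],
    [3, 9, 12, 6, 2],
    [8, 4, 2, 0, 0],
]
dx = [-2, -1, 0, 1, 2]   # (sum assured - 30) / 10
dy = [-2, -1, 0, 1]      # (age midpoint - 45) / 10

def grouped_r(freq, dx, dy):
    """Correlation coefficient for a bivariate frequency table using step deviations."""
    cells = [(f, dx[j], dy[i]) for i, row in enumerate(freq) for j, f in enumerate(row)]
    n = sum(f for f, _, _ in cells)
    s_fdx = sum(f * a for f, a, _ in cells)
    s_fdy = sum(f * b for f, _, b in cells)
    s_fdx2 = sum(f * a * a for f, a, _ in cells)
    s_fdy2 = sum(f * b * b for f, _, b in cells)
    s_fdxdy = sum(f * a * b for f, a, b in cells)
    num = n * s_fdxdy - s_fdx * s_fdy
    den = sqrt(n * s_fdx2 - s_fdx ** 2) * sqrt(n * s_fdy2 - s_fdy ** 2)
    return num / den

r_grouped = grouped_r(freq, dx, dy)
print(round(r_grouped, 3))
```

Because r is independent of change of origin and scale, other choices of assumed mean give the same result.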
Probable Error
$P.E._r = 0.6745\,\dfrac{1 - r^2}{\sqrt{N}}$
Interpretation
- If r < P.E., there is no evidence of correlation
- If r > 6 P.E., r is significant
- The population correlation coefficient lies within r ± P.E., i.e. these are its upper and lower limits
If r = 0.6 and N = 64, find the probable error of the coefficient of correlation and the limits for the population correlation coefficient.
$P.E._r = 0.6745 \times \dfrac{1 - (0.6)^2}{\sqrt{64}} = 0.6745 \times \dfrac{0.64}{8} = 0.054$
Limits = (r + P.E.) and (r - P.E.) = (0.6 + 0.054) and (0.6 - 0.054) = 0.654 and 0.546
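As a quick check of the probable-error arithmetic, here is a small Python sketch; the function name `probable_error` is an illustrative choice.

```python
from math import sqrt

def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

r, n = 0.6, 64
pe = probable_error(r, n)
lower, upper = r - pe, r + pe
print(round(pe, 3), round(lower, 3), round(upper, 3))
```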
Limitations of r
Always assumes linear relationship
Interpretation to be handled carefully
Affected by extreme values
Time consuming
Rank Correlation Coefficient
$R = 1 - \dfrac{6\sum D^2}{N(N^2-1)} = 1 - \dfrac{6\sum D^2}{N^3 - N}$
where D = difference between paired ranks and R = rank correlation coefficient
R (actual ranks)
Two managers are asked to rank a group of employees in order of potential for eventually becoming top managers. The rankings, with the squared rank differences, are as follows:
Employee   R1   R2   (R1-R2)^2
A          10    9    1
B           2    4    4
C           1    2    1
D           4    3    1
E           3    1    4
F           6    5    1
G           5    6    1
H           8    8    0
I           7    7    0
J           9   10    1
N = 10                14
Rank Correlation Coefficient
$R = 1 - \dfrac{6\sum D^2}{N(N^2-1)} = 1 - \dfrac{6 \times 14}{10(10^2-1)} = 1 - \dfrac{84}{990} = 0.915$
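The rank formula for the two managers' rankings can be sketched in Python as follows. The function name is illustrative, and the sketch assumes there are no tied ranks.

```python
def spearman_from_ranks(r1, r2):
    """Spearman's R = 1 - 6*sum(D^2) / (N*(N^2 - 1)); assumes no tied ranks."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

# Ranks given by the two managers to employees A..J
manager_1 = [10, 2, 1, 4, 3, 6, 5, 8, 7, 9]
manager_2 = [9, 4, 2, 3, 1, 5, 6, 8, 7, 10]
rank_corr = spearman_from_ranks(manager_1, manager_2)
print(round(rank_corr, 3))
```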
R (ranks not given)
Calculate R for the following data of marks in 2 tests given to candidates for a clerical post.
Preliminary test X   R1   Final test Y   R2   (R1-R2)^2
92                   10   86              9    1
89                    9   83              7    4
87                    8   91             10    4
86                    7   77              5    4
83                    6   68              4    4
77                    5   85              8    9
71                    4   52              2    4
63                    3   82              6    9
53                    2   37              1    1
50                    1   57              3    4
N = 10                                        44
R (ranks not given)
$R = 1 - \dfrac{6\sum D^2}{N(N^2-1)} = 1 - \dfrac{6 \times 44}{990} = 0.733$
Equal Ranks
Where equal ranks are assigned to some entries, an adjustment is made in the formula for calculating the rank correlation coefficient: add $\frac{1}{12}(m^3 - m)$ to the value of $\sum D^2$ for each group of tied entries, where m stands for the number of items whose ranks are equal.
An examination of eight applicants for a clerical post was taken by a firm. From the marks obtained by the applicants in the accountancy and statistics papers, compute R.
Applicant   Accountancy X   R1    Statistics Y   R2   (R1-R2)^2 = D^2
A           15              2     40             6    16.00
B           20              3.5   30             4     0.25
C           28              5     50             7     4.00
D           12              1     30             4     9.00
E           40              6     20             2    16.00
F           60              7     10             1    36.00
G           20              3.5   30             4     0.25
H           80              8     60             8     0.00
N = 8                                                 81.5
$R = 1 - \dfrac{6\left\{\sum D^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2)\right\}}{N^3 - N}$
$R = 1 - \dfrac{6\left\{81.5 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(3^3 - 3)\right\}}{8^3 - 8} = 1 - \dfrac{6 \times 84}{504} = 0$
THERE IS NO CORRELATION
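The equal-ranks adjustment can be sketched in Python. The helper names below are hypothetical; tied values receive the average of the positions they occupy, and (m^3 - m)/12 is added once for each group of m tied values.

```python
from collections import Counter

def average_ranks(values):
    """Rank from smallest (rank 1) upward; tied values share their mean rank."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m)/12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n ** 3 - n)

accountancy = [15, 20, 28, 12, 40, 60, 20, 80]
statistics_marks = [40, 30, 50, 30, 20, 10, 30, 60]
r_tied = spearman_with_ties(accountancy, statistics_marks)
print(round(r_tied, 3))
```

For the eight applicants this reproduces the adjusted sum 81.5 + 0.5 + 2 = 84 used in the hand calculation.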
Q
In the list of 500 companies published in ET, the following are the ranks of the top ten companies according to their
- overall rating
- market capitalisation
- net profit
Calculate the rank correlation coefficient between
- overall rank and market cap rank
- overall rank and rank as per net profit
- market cap rank and rank as per net profit
Name          Overall rank in   Rank as per market cap     Rank as per net profit
              February 2006     (within these 10 cos.)     (within these 10 cos.)
Infosys        1                 1                          2
TCS            2                 2                          3
WIPRO          3                 4                          5
Bharti         4                 3                          4
Hero Honda     5                 5                          1
ITC            6                 8                          8
Satyam         7                 9                          9
HDFC           8                 6                          7
Tata Motors    9                 7                          6
Siemens       10                10                         10
Solution
(i) 0.8909
(ii) 0.7576
(iii) 0.8667
Q
A panel consisting of two members interviewed 10 candidates for the post of student president. Following are the ranks assigned. Is there a consensus among the two panel members?
Ranks assigned by panel member 1: 7 9 2 4 5 5 8 10 3 1
Ranks assigned by panel member 2: 8 10 4 6 4 4 7 9 1 2
R1   R2   Adjusted R1   Adjusted R2   D      D^2
7    8    7             8             -1     1
9    10   9             10            -1     1
2    4    2             4             -2     4
4    6    4             6             -2     4
5    4    5.5           4             1.5    2.25
5    4    5.5           4             1.5    2.25
8    7    8             7             1      1
10   9    10            9             1      1
3    1    3             1             2      4
1    2    1             2             -1     1
Total                                        21.5
Rank 5 occurs twice in R1 and rank 4 occurs thrice in R2.
$R = 1 - \dfrac{6\left\{21.5 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(3^3 - 3)\right\}}{10(10^2 - 1)} = 1 - \dfrac{6 \times 24}{990} = 0.855$
There is a consensus.
Covariance
Variance shows the extent of deviation of a data set from its mean.
The joint variation of two data sets about their respective means is known as covariance.
When the X and Y variables are expressed in different units, covariance is difficult to interpret; r is a better measure.
REGRESSION
The statistical tool with the help of which we are in a position to estimate (or predict) the unknown values of one variable from known values of another variable is called regression
Why
- Provides values of the dependent variable from the independent variable
- Obtain a measure of error
- Measure correlation
The Linear Bivariate Regression Model
The value of the dependent variable Y depends to some degree on the independent variable X.
Basis of the CAPM (capital asset pricing model) in the finance specialisation.
Y = a + bX
X is the independent variable; Y is the dependent variable.
When X = 0, Y = a; a is called the Y intercept.
The slope of the line is measured by b, the average amount of change in Y for one unit change in X.
[Figure: regression line Y = a + bX plotted against X]
Regression Equation Of Y on X
Y = a + bX
Normal equations:
$\sum Y = Na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
Regression Equation Of X on Y
X = a + bY
Normal equations:
$\sum X = Na + b\sum Y$
$\sum XY = a\sum Y + b\sum Y^2$
Calculate the regression equations of X on Y and Y on X from the following data.
X       Y       X^2     Y^2     XY
1       2        1        4      2
2       5        4       25     10
3       3        9        9      9
4       8       16       64     32
5       7       25       49     35
15      25      55      151     88   (totals)
Regression equation of X on Y: X = a + bY
The normal equations are:
$\sum X = Na + b\sum Y$ and $\sum XY = a\sum Y + b\sum Y^2$
15 = 5a + 25b
88 = 25a + 151b
Solving simultaneously, a = 0.5 and b = 0.5, so X = 0.5 + 0.5Y
Regression equation of Y on X: Y = a + bX
The normal equations are:
$\sum Y = Na + b\sum X$ and $\sum XY = a\sum X + b\sum X^2$
Substituting:
25 = 5a + 15b
88 = 15a + 55b
Solving, a = 1.10 and b = 1.30, so Y = 1.10 + 1.30X
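Solving the two normal equations by hand is equivalent to the closed-form expressions used in this Python sketch; the function name `fit_line` is an illustrative choice. Swapping the arguments gives the regression of X on Y.

```python
def fit_line(xs, ys):
    """Solve the normal equations for ys = a + b * xs and return (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
a_yx, b_yx = fit_line(X, Y)   # regression of Y on X
a_xy, b_xy = fit_line(Y, X)   # regression of X on Y
print(round(a_yx, 2), round(b_yx, 2), round(a_xy, 2), round(b_xy, 2))
```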
Deviations from Means
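For Y = a + bX: $Y - \bar{Y} = b_{yx}(X - \bar{X})$, with $b_{yx} = \dfrac{\sum xy}{\sum x^2}$
For X = a + bY: $X - \bar{X} = b_{xy}(Y - \bar{Y})$, with $b_{xy} = \dfrac{\sum xy}{\sum y^2}$
where $x = X - \bar{X}$ and $y = Y - \bar{Y}$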
In the following table are recorded data showing the test scores made by salesmen on an intelligence test and their weekly sales. Calculate the regression equation and estimate the probable weekly sales volume if a salesman makes a score of 100.
Salesman          1     2     3     4     5     6     7     8     9     10
Test score        40    70    50    60    80    50    90    40    60    60
Sales (000 Rs)    2.5   6     4     5     4     2.5   5.5   3     4.5   3
Let sales be denoted by Y and test scores by X. We have to fit a regression equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
Salesman  Test score X  x = X - 60   x^2    Sales Y  y = Y - 4   y^2    xy
1         40            -20          400    2.5      -1.5        2.25   +30
2         70            +10          100    6.0      +2.0        4.00   +20
3         50            -10          100    4.0       0          0       0
4         60              0            0    5.0      +1.0        1.00    0
5         80            +20          400    4.0       0          0       0
6         50            -10          100    2.5      -1.5        2.25   +15
7         90            +30          900    5.5      +1.5        2.25   +45
8         40            -20          400    3.0      -1.0        1.00   +20
9         60              0            0    4.5      +0.5        0.25    0
10        60              0            0    3.0      -1.0        1.00    0
N = 10    600             0         2400    40        0         14      130
$\bar{X} = 60$; $\bar{Y} = 4$
$b_{yx} = \dfrac{\sum xy}{\sum x^2} = \dfrac{130}{2400} = 0.054$
Y - 4 = 0.054(X - 60)
Y = 0.76 + 0.054X
When X is 100, Y = 0.76 + 0.054(100) = 6.16
Regression equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
Regression equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
A company wants to assess the impact of R&D expenditure on its annual profit. The following table presents the information for the last 8 years.
Year                    2008   2007   2006   2005   2004   2003   2002   2001
R&D expenditure          9      7      5     10      4      5      3      2
Annual profit (000)     45     42     41     60     30     34     25     20
Estimate the regression equation and predict the annual profit for an allocated sum of Rs. 100,000 as R&D expenditure.
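Let R&D expenditure be denoted by X and annual profit by Y.
$\bar{X} = 45/8 = 5.625$; $\bar{Y} = 297/8 = 37.125$
Deviations are taken from the assumed means 6 and 37.
Year    X    dx = X - 6   dx^2   Y     dy = Y - 37   dy^2   dxdy
2001    2    -4           16     20    -17           289    +68
2002    3    -3            9     25    -12           144    +36
2003    5    -1            1     34     -3             9     +3
2004    4    -2            4     30     -7            49    +14
2005    10   +4           16     60    +23           529    +92
2006    5    -1            1     41     +4            16     -4
2007    7    +1            1     42     +5            25     +5
2008    9    +3            9     45     +8            64    +24
Total   45   -3           57     297    +1          1125    238
Regression equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
$b_{yx} = \dfrac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - (\sum d_x)^2} = \dfrac{8(238) - (-3)(1)}{8(57) - (-3)^2} = \dfrac{1907}{447} = 4.266$
Y - 37.125 = 4.266(X - 5.625)
Y = 13.129 + 4.266X
Taking X in thousands of rupees, X = 100: Y = 13.129 + 4.266(100) = 439.729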
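The assumed-mean (step-deviation) route to the slope can be sketched in Python as below. The function name and the choice of assumed means 6 and 37 mirror the worked table and are otherwise illustrative; the intercept then follows from the actual means.

```python
def regression_y_on_x(xs, ys, assumed_x, assumed_y):
    """Slope from deviations about assumed means; intercept from the actual means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    num = n * sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy)
    den = n * sum(a * a for a in dx) - sum(dx) ** 2
    b = num / den
    a = sum(ys) / n - b * sum(xs) / n
    return a, b

# 2001..2008, as in the worked table
rd_spend = [2, 3, 5, 4, 10, 5, 7, 9]
profit = [20, 25, 34, 30, 60, 41, 42, 45]
a_rd, b_rd = regression_y_on_x(rd_spend, profit, assumed_x=6, assumed_y=37)
predicted_profit = a_rd + b_rd * 100
print(round(a_rd, 3), round(b_rd, 3), round(predicted_profit, 1))
```

The unrounded slope gives a prediction that differs slightly in the second decimal from the hand calculation, which rounds b to 4.266 first. Note also that X = 100 lies far outside the observed range of 2 to 10, so the prediction is an extrapolation.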
Regression Coefficients
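Regression coefficient of X on Y
$b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y}$
$b_{xy} = \dfrac{\sum xy}{\sum y^2}$
$b_{xy} = \dfrac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_y^2 - (\sum d_y)^2}$
Regression coefficient of Y on X
$b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$
$b_{yx} = \dfrac{\sum xy}{\sum x^2}$
$b_{yx} = \dfrac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - (\sum d_x)^2}$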
Properties
- The coefficient of correlation is the GM of the two regression coefficients, i.e. $r = \pm\sqrt{b_{yx}\, b_{xy}}$
- If one of the coefficients is more than unity, the other must be less than unity
- Both the coefficients will have the same sign
- The regression and correlation coefficients will have the same sign
- The average value of the two regression coefficients is greater than or equal to r
- Regression coefficients are independent of change of origin but not of scale
  X - A = change of origin; (X - A)/i = change of scale
Bivariate Grouped Frequency Distributions
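$b_{xy} = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{N\sum f d_y^2 - (\sum f d_y)^2} \times \dfrac{h}{k}$
$b_{yx} = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{N\sum f d_x^2 - (\sum f d_x)^2} \times \dfrac{k}{h}$
h = width of class interval of the X variable
k = width of class interval of the Y variable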
Obtain the two regression equations from the following bivariate frequency distribution.
Advertising expenditure    Sales revenue (in Rs. Lakhs)
(in Rs. Thousand)          75-125   125-175   175-225   225-275
5-15                        3        8         2         3
15-25                       4        6         2         3
25-35                       4        5         3         2
35-45                       8        7         4         2
Estimate
1. The sales corresponding to advertising expenditure of Rs. 50,000
2. The advertising expenditure for a sales revenue of Rs. 300 lakhs
3. The coefficient of correlation, and interpret its value
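Let X be sales revenue and Y be advertising expenditure.
Regression equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
$\bar{X} = A + \dfrac{\sum f d_x}{N} \times h = 150 + \dfrac{12}{66} \times 50 = 159.09$
$\bar{Y} = A + \dfrac{\sum f d_y}{N} \times k = 30 + \dfrac{-26}{66} \times 10 = 26.06$
$b_{xy} = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{N\sum f d_y^2 - (\sum f d_y)^2} \times \dfrac{h}{k} = \dfrac{66(-14) - (12)(-26)}{66(100) - (-26)^2} \times \dfrac{50}{10} = -0.5165$
X - 159.09 = -0.5165(Y - 26.06)
X = 172.55 - 0.5165Y
Regression Eqn of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
$b_{yx} = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{N\sum f d_x^2 - (\sum f d_x)^2} \times \dfrac{k}{h} = \dfrac{66(-14) - (12)(-26)}{66(70) - (12)^2} \times \dfrac{10}{50} = -0.0273$
Y - 26.06 = -0.0273(X - 159.09)
Y = 30.40 - 0.0273X
1. X = 172.55 - 0.5165(50) = 146.725
2. Y = 30.40 - 0.0273(300) = 22.21
3. $r = -\sqrt{b_{yx}\, b_{xy}} = -\sqrt{0.5165 \times 0.0273} = -0.119$ (negative because both regression coefficients are negative)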
Coefficient Of Determination
$r^2$ is the proportion of the variation in the dependent variable Y explained by regression on the variable X, where Y = a + bX.
Coefficient of determination: $0 \le r^2 \le 1$
When r = 0.9, $r^2$ = 0.81: 81% of the variation in the dependent variable is explained by the independent variable.
It is the ratio of explained variance to total variance.
β of a stock / share
Reflects the sensitivity of a stock to movement in a stock market index such as the Bombay Stock Exchange SENSEX or the NSE NIFTY.
The β value for the market index is taken as one.
A stock with a β value equal to 0.8 would rise by 80% of the rise in the market.
BSE July-2012 to June-2013
Code     Company                          Beta     Co-efficient of       Avg. Daily       Returns        Weightage (%)
                                          Values   Determination (R2)    Volatility (%)   (1 year) (%)   in S&P SENSEX
500010   HDFC                             0.98     0.32                  1.43             38.27          8.15
500087   CIPLA LTD.                       0.62     0.12                  1.46             20.06          1.15
500103   BHEL                             1.34     0.33                  1.94             -3.27          1.02
500112   STATE BANK OF INDIA              1.41     0.43                  1.77              1.06          3.32
500124   DR. REDDY'S LABORATORIES LTD.    0.33     0.05                  1.23             26.82          1.58
500180   HDFC BANK LTD.                   1.02     0.48                  1.21             42.53          7.91
500182   HERO                             0.68     0.16                  1.41             -5.01          1.03
500209   INFOSYS LTD.                     0.81     0.09                  2.25              0.83          6.97
500312   ONGC                             0.93     0.24                  1.56             32.70          4.15
         RELIANCE
Measured by fitting a regression equation
$y = \alpha + \beta x$
y = % daily change in the stock price (stock returns)
x = % daily change in the market index (index returns)
$\beta = \dfrac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} = \dfrac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2}$
Q
The following data relates to the closing BSE Sensex and the stock price of RIL for 10 trading days. Calculate the β measure.
x = % daily change in the market index = (value of BSE index on day 2 - value on day 1) × 100 / value on day 1
For the first change: x = (12373 - 12389) × 100 / 12389 = -1600 / 12389 = -0.1291
BSE index   Reliance Industries   BSE (x)        Reliance Industries (y)
12389       1155.05
12373       1163.05               -0.12914682     0.692610709
12366       1154.1                -0.0565748     -0.769528395
12364       1150.5                -0.01617338    -0.311931375
12353       1143.2                -0.08896797    -0.634506736
12538       1169.5                 1.497611916    2.300559832
12736       1190.15                1.579199234    1.765711843
12928       1213.4                 1.507537688    1.953535269
12884       1216.05               -0.34034653     0.218394594
12858       1208                  -0.20180068    -0.661979359
Covariance ( x , y ) = 0.817
Variance ( x ) = 0.626
Beta of Shares = Covariance (x, y) / Var (x) = 1.306
This implies that the Reliance Industries stock is 30.647% more aggressive than the BSE index.
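The β calculation can be sketched in Python directly from the two closing-price series. The function names are illustrative, and the sketch assumes covariance and variance are both divided by n (the population form), which is what reproduces the figures 0.817 and 0.626 quoted above.

```python
def pct_changes(prices):
    """Percentage change from each day to the next."""
    return [(b - a) * 100 / a for a, b in zip(prices, prices[1:])]

def beta(index_prices, stock_prices):
    """Beta = cov(x, y) / var(x) on daily % changes (population form, divide by n)."""
    x = pct_changes(index_prices)
    y = pct_changes(stock_prices)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    return cov_xy / var_x

bse = [12389, 12373, 12366, 12364, 12353, 12538, 12736, 12928, 12884, 12858]
ril = [1155.05, 1163.05, 1154.1, 1150.5, 1143.2, 1169.5, 1190.15, 1213.4, 1216.05, 1208]
ril_beta = beta(bse, ril)
print(round(ril_beta, 3))
```

Since the same divisor appears in both covariance and variance, using n - 1 instead would leave β unchanged.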
Assignment in groups ( marks 4 )
Solve in Excel. Show solution group wise.
The following data gives the closing prices of the BSE Sensex and the stock prices of three individual companies, viz. ICICI Bank, L&T and RIL, for 10 trading days.
Find r between the stock prices of the companies and comment:
- ICICI & RIL
- ICICI and L&T
- L&T and RIL
Also, measure β of all three and comment.
Date         BSE Sensex   ICICI Bank   Reliance Industries   L&T
6/3/2006     10735        613.2        731.9                 2413.4
7/3/2006     10725        600.65       731.85                2493.8
8/3/2006     10509        590.55       719.45                2439.4
9/3/2006     10574        601.75       726.7                 2442.35
10/3/2006    10765        612.9        732.15                2512.65
13-3-2006    10804        603.1        732.35                2536.25
16-3-2006    10878        607.5        768.5                 2507.35
17-3-2006    10860        605.25       774.85                2466.75
20-3-2006    10941        605.4        776.2                 2461.45
21-3-2006    10905        597.8        780.3                 2466.05
r: (i) 0.104378 (ii) 0.146146 (iii) 0.073593
(i) Beta of ICICI stock = 1.0273. This implies that the ICICI stock is 2.73% more aggressive than the BSE.
(ii) Beta of Reliance Industries stock = 0.8523. This implies that the Reliance Industries stock gives 14.77% (1 - 0.8523) less returns than the BSE.
(iii) Beta of L&T stock = 0.981. This implies that the L&T stock gives 1.866% less returns than the BSE.
Time Series Analysis
A time series is a set of numerical values of some variable obtained at regular intervals of time.
Role
- Helps in the understanding of past behaviour
- Helps in planning future operations
- Helps in evaluating current accomplishments
- Facilitates comparisons
Components
- Secular Trend (T)
- Seasonal Variations (S)
- Cyclical Variations (C)
- Irregular Variations (I)
$Y = T \times S \times C \times I$
[Figure: billion bottles per year, 1994 to 2006, with a linear trend line]
Secular Trend
A type of variation that reflects the long-term movement of a time series over a long period of time, e.g. declining death rate, upward trend in population, prices, migration from cities to towns.
- Linear trend
- Non linear trend
Seasonal Variations
Periodic movements in business activity which occur regularly every year and have their origin in the nature of the year itself.
Factors:
- Climate and weather conditions
- Customs, traditions and habits
Cyclical Variations
[Figure: business cycle showing the phases prosperity, decline and improvement]
Irregular Variations
Rapid changes in the data caused by short-term, unanticipated and non-recurring factors, e.g. hurricanes, typhoons, riots etc.
Straight Line Trend Methods Of Measurement
Freehand or Graphic method
The semi average method
The method of least squares
Freehand Or Graphic Method
Fit a trend line to the following data by the freehand method:
Year    Prod. of sugar (in million tonnes)
2000    20
2001    22
2002    24
2003    21
2004    23
2005    25
2006    23
2007    26
2008    25
[Figure: production plotted against year, 2000 to 2008, with a freehand linear trend line]
Merits
Very simple
Limitations
Subjective
Time consuming
Method of semi averages
- Data is divided into 2 equal parts
- Even number of years: divide into two equal halves
- Odd number of years: leave out the middle year
- Take the average of each part
- Plot each average at the mid point of its period and join the two points
Q
Fit a trend line by the method of semi averages.
Year    Production (in 00 tonnes)      Year    Production (in 00 tonnes)
1990    3.6                            2001    6.7
1991    0.8                            2002    7.0
1992    1.9                            2003    7.9
1993    2.9                            2004    7.4
1994    2.7                            2005    10.8
1995    4.7                            2006    9.2
1996    5.3                            2007    10.5
1997    6.5                            2008    15.5
1998    5.2                            2009    13.7
1999    5.9                            2010    16.7
2000    7.1                            2011    15.0
Mean of first 11 entries = 4.236 (first trend point)
Mean of second 11 entries = 10.945 (second trend point)
Make a graph. Join the two points. Extend the line on both sides.
Fit a trend line to the following data by the method of semi averages :
Year                      2002   2003   2004   2005   2006   2007   2008
Sales (thousand units)    102    105    114    110    108    116    112
The middle year, 2005, is left out.
(102 + 105 + 114) / 3 = 107
(108 + 116 + 112) / 3 = 112
[Figure: sales plotted against year, 2002 to 2008, with the semi-average trend line]
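The semi-average rule (split into two halves, dropping the middle value when the count is odd) can be sketched in Python; the function name is an illustrative choice.

```python
def semi_averages(values):
    """Return the means of the first and second halves; the middle value is skipped when n is odd."""
    n = len(values)
    half = n // 2
    first = values[:half]
    second = values[n - half:]
    return sum(first) / half, sum(second) / half

sales_2002_2008 = [102, 105, 114, 110, 108, 116, 112]
first_avg, second_avg = semi_averages(sales_2002_2008)
print(first_avg, second_avg)
```

The two averages are plotted at the middle of their respective halves (here 2003 and 2007) and joined to give the trend line.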
Method of Least Squares
Y=a + bX
$\sum Y = Na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
When $\sum X = 0$: $a = \dfrac{\sum Y}{N}$ and $b = \dfrac{\sum XY}{\sum X^2}$
Below are given the figures of production (in [Link]) of a sugar factory. Fit a straight line trend to these figures by MLS. Estimate the likely sales of the company during 2009.
Year    2002   2003   2004   2005   2006   2007   2008
Prod    80     90     92     83     94     99     92
Year     Prod Y   Dev X   XY      X^2   Trend values Yc
2002     80       -3      -240    9     84
2003     90       -2      -180    4     86
2004     92       -1       -92    1     88
2005     83        0         0    0     90
2006     94       +1       +94    1     92
2007     99       +2      +198    4     94
2008     92       +3      +276    9     96
N = 7    630       0        56    28    630
Yc = a + bX
$a = \dfrac{\sum Y}{N} = \dfrac{630}{7} = 90$
$b = \dfrac{\sum XY}{\sum X^2} = \dfrac{56}{28} = 2$
Yc = 90 + 2X
For X = -3: Yc = 90 + 2(-3) = 84
For X = -2: Yc = 90 + 2(-2) = 86
For X = -1: Yc = 90 + 2(-1) = 88
For 2009, X = +4: Yc = 90 + 2(4) = 98
The likely production for 2009 is 98 tonnes.
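With deviations taken from the middle year so that ΣX = 0, the least-squares trend reduces to two divisions. The Python sketch below assumes equally spaced years; the function name `linear_trend` is illustrative.

```python
def linear_trend(years, values):
    """Fit Yc = a + bX with X measured from the mean year, so that sum(X) = 0."""
    n = len(years)
    origin = sum(years) / n
    xs = [year - origin for year in years]
    a = sum(values) / n
    b = sum(x * v for x, v in zip(xs, values)) / sum(x * x for x in xs)
    return lambda year: a + b * (year - origin)

years = [2002, 2003, 2004, 2005, 2006, 2007, 2008]
production = [80, 90, 92, 83, 94, 99, 92]
trend = linear_trend(years, production)
trend_values = [trend(y) for y in years]
forecast_2009 = trend(2009)
print(trend_values, forecast_2009)
```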
Non Linear Trend
Freehand or graphic Method
Moving Average method
Parabolic Trend
Method of Moving Averages
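The average value for a number of years is taken, and this average is taken as the trend value for the unit of time falling at the middle of the period covered in the calculation of the average.
For a 3-yearly moving average of the values a, b, c, d, e, ...: $\dfrac{a+b+c}{3},\ \dfrac{b+c+d}{3},\ \dfrac{c+d+e}{3},\ \ldots$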
Calculate the 5 yearly moving average for the following data of the number of commercial industrial failures in a country during 1993 to 2008.
Year    No. of failures   5 yearly moving total   5 yearly moving average
1993    23
1994    26
1995    28                129                     25.8 = 26
1996    32                118                     24
1997    20                104                     21
1998    12                 86                     17
1999    12                 63                     13
2000    10                 56                     11
2001     9                 55                     11
2002    13                 57                     11
2003    11                 59                     12
2004    14                 59                     12
2005    12                 49                     10
2006     9                 39                      8
2007     3
2008     1
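A moving average with an odd window can be sketched in a line of Python; each result belongs to the middle year of its window. The function name is illustrative.

```python
def moving_average(values, window):
    """Simple moving averages; result i is centred on values[i + window // 2] for odd windows."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

failures = [23, 26, 28, 32, 20, 12, 12, 10, 9, 13, 11, 14, 12, 9, 3, 1]  # 1993..2008
ma5 = moving_average(failures, 5)
print([round(v, 1) for v in ma5])
```

The first value corresponds to 1995 and the last to 2006; no trend value exists for the first two and last two years.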
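Each 4-yearly total and average falls between the year on its row and the following year; the centred value is the mean of two adjacent 4-yearly averages.
Year    Cargo   4 yearly moving total   4 yearly moving avg   4 yearly centred moving avg
1997    1102
1998    1250    4872                    1218.00
1999    1180    4982                    1245.50               1231.75
2000    1340    5049                    1262.25               1253.87
2001    1212    5321                    1330.25               1296.25
2002    1317    5530                    1382.50               1356.37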
Utility
Gives a smoother curve
Lessens the effect of fluctuations
Simple
Limitations
Cannot be used for forecasting
Judgemental selection of time period for moving average
Q
Calculate the quarterly trend values by the method of centred moving averages for the data given, taking 4 quarters at a time.
Each 4-quarter total falls between the quarter on its row and the next quarter; the centred moving average is the sum of two adjacent totals divided by 8.
Year    Quarter      Deposits   4-quarter moving total   Centred moving average
2003    March        391
        June         439        1762
        September    452        1880                     455
        December     480        2003                     485
2004    March        509        2123                     516
        June         562        2265                     549
        September    572        2381                     581
        December     622        2504                     611
2005    March        625        2619                     640
        June         685        2742                     670
        September    687        2860                     700
        December     745        2983                     730
2006    March        743        3101                     761
        June         808        3223                     790
        September    805
        December     867
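For an even window such as 4 quarters, the moving totals fall between periods, so two adjacent totals are averaged to centre the value on a quarter. A minimal Python sketch, with an illustrative function name:

```python
def centred_moving_average(values, window=4):
    """Centred moving average for an even window: mean of two adjacent moving averages."""
    totals = [sum(values[i:i + window]) for i in range(len(values) - window + 1)]
    return [(a + b) / (2 * window) for a, b in zip(totals, totals[1:])]

# Quarterly deposits, March 2003 .. December 2006
deposits = [391, 439, 452, 480, 509, 562, 572, 622,
            625, 685, 687, 745, 743, 808, 805, 867]
cma = centred_moving_average(deposits)
print(cma)
```

The first value belongs to September 2003 and the last to June 2006; the worked table shows these values rounded to whole numbers.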
Second Degree Parabola
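$Y_c = a + bX + cX^2$
$\sum Y = Na + b\sum X + c\sum X^2$
$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$
$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$
When $\sum X = 0$ and $\sum X^3 = 0$:
$a = \dfrac{\sum Y - c\sum X^2}{N}$; $b = \dfrac{\sum XY}{\sum X^2}$; $c = \dfrac{N\sum X^2 Y - \sum X^2 \sum Y}{N\sum X^4 - (\sum X^2)^2}$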
Fit a parabolic curve of the second degree to the following data, estimate the value for 2010 and comment on it.
Year                     2004   2005   2006   2007   2008
Sales (in 10 000 Rs)     10     12     13     10     8
Year     Sales Y   X     XY     X^2   X^2 Y   X^4
2004     10        -2    -20    4     40      16
2005     12        -1    -12    1     12       1
2006     13         0      0    0      0       0
2007     10        +1    +10    1     10       1
2008      8        +2    +16    4     32      16
N = 5    53         0     -6    10    94      34
$Y_c = a + bX + cX^2$
$b = \dfrac{\sum XY}{\sum X^2} = \dfrac{-6}{10} = -0.6$
$c = \dfrac{N\sum X^2 Y - \sum X^2 \sum Y}{N\sum X^4 - (\sum X^2)^2} = \dfrac{5 \times 94 - 10 \times 53}{5 \times 34 - (10)^2} = -0.857$
$a = \dfrac{\sum Y - c\sum X^2}{N} = \dfrac{53 + 8.57}{5} = 12.3$
$Y_c = 12.3 - 0.6X - 0.857X^2$
For 2010, X = +4:
$Y_c = 12.3 - 0.6(4) - 0.857(4)^2 = 12.3 - 2.4 - 13.7 = -3.812$
As the predicted sale is negative, the second degree parabola does not seem to describe the data well.
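The simplified parabola formulas (valid when X is centred so that ΣX = ΣX³ = 0) can be sketched in Python. The function name is illustrative, and the sketch assumes an odd number of equally spaced periods.

```python
def parabolic_trend(values):
    """Fit Yc = a + bX + cX^2 with X = ..., -1, 0, +1, ... (odd number of periods)."""
    n = len(values)
    xs = [i - n // 2 for i in range(n)]
    sum_y = sum(values)
    sum_xy = sum(x * y for x, y in zip(xs, values))
    sum_x2 = sum(x ** 2 for x in xs)
    sum_x2y = sum(x ** 2 * y for x, y in zip(xs, values))
    sum_x4 = sum(x ** 4 for x in xs)
    b = sum_xy / sum_x2
    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    return a, b, c

sales = [10, 12, 13, 10, 8]          # 2004..2008
a, b, c = parabolic_trend(sales)
estimate_2010 = a + b * 4 + c * 4 ** 2   # X = +4 for 2010
print(round(a, 3), round(b, 3), round(c, 3), round(estimate_2010, 2))
```

With unrounded coefficients the 2010 estimate comes out close to -3.8, the same negative sign as the hand calculation.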
Seasonal Variations
1. Method of Simple Averages
2. Ratio to trend method
SV Method Of Simple Averages
1. Arrange the unadjusted data by years and months
2. Find the totals of January, February etc.
3. Divide each total by the number of time periods
4. Obtain an average of the averages
5. Calculate each monthly average as a % of the average of averages
Consumption of monthly electric power in millions of kWh for street lighting in Bangalore during 2004-2008 is given below. Find out the seasonal variation by the method of monthly averages.
Yr    J     F     M     A     M     J     J     A     S     O     N     D
04    318   281   278   250   231   216   223   245   269   302   325   347
05    342   309   299   268   249   236   242   262   288   321   342   364
06    367   328   320   287   269   251   259   284   309   345   367   394
07    392   349   342   311   290   273   282   305   328   364   389   417
08    420   378   370   334   314   296   305   330   356   396   422   452
Consumption of monthly elec power
Month     Total (5 yrs)   5 yrly avg   %
Jan       1839            367.8        116.1
Feb       1645            329.0        103.9
Mar       1609            321.8        101.6
Apr       1450            290.0         91.6
May       1353            270.6         85.4
Jun       1272            254.4         80.3
Jul       1311            262.2         82.8
Aug       1426            285.2         90.1
Sep       1550            310.0         97.9
Oct       1728            345.6        109.1
Nov       1845            369.0        116.5
Dec       1974            394.8        124.7
Total     19002           3800.4       1200
Average                   316.7        100
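The method of simple averages reduces to averaging each month across years and expressing it as a percentage of the overall monthly average. A minimal Python sketch, with illustrative names:

```python
def seasonal_indices(rows):
    """rows: one list of period values per year. Returns each period's index (overall average = 100)."""
    periods = len(rows[0])
    period_avg = [sum(row[i] for row in rows) / len(rows) for i in range(periods)]
    grand_avg = sum(period_avg) / periods
    return [100 * avg / grand_avg for avg in period_avg]

power = [
    [318, 281, 278, 250, 231, 216, 223, 245, 269, 302, 325, 347],  # 2004
    [342, 309, 299, 268, 249, 236, 242, 262, 288, 321, 342, 364],  # 2005
    [367, 328, 320, 287, 269, 251, 259, 284, 309, 345, 367, 394],  # 2006
    [392, 349, 342, 311, 290, 273, 282, 305, 328, 364, 389, 417],  # 2007
    [420, 378, 370, 334, 314, 296, 305, 330, 356, 396, 422, 452],  # 2008
]
indices = seasonal_indices(power)
print([round(v, 1) for v in indices])
```

By construction the twelve indices sum to 1200.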
Merits and Limitations
Simple
Assumes there is no trend
Ratio to Trend Method
$Y = T \times S \times C \times I$
$\dfrac{T \times S \times C \times I}{T} = S \times C \times I$
Find the seasonal variation by the ratio to trend method from the data given below.
Year    1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
2004    30        40        36        34
2005    34        52        50        44
2006    40        58        54        48
2007    54        76        68        62
2008    80        92        86        82
Year     Yearly totals   Yr avg Y   Dev X   XY      X^2   Trend values
2004     140             35         -2      -70     4     32
2005     180             45         -1      -45     1     44
2006     200             50          0        0     0     56
2007     260             65         +1      +65     1     68
2008     340             85         +2     +170     4     80
Total                    280         0      120     10
Y= a + bX
As $\sum X = 0$: $a = \dfrac{\sum Y}{N} = \dfrac{280}{5} = 56$; $b = \dfrac{\sum XY}{\sum X^2} = \dfrac{120}{10} = 12$
Yc = a + bX
Yc = 56 + 12(-2) = 32
Yc = 56 + 12(-1) = 44
Quarterly increment = 12 / 4 = 3
For 2004, the trend value for the middle of the year, i.e. half of the 2nd and half of the 3rd quarter, is 32.
Trend value of 2nd qtr is 32 - 3/2 = 30.5
Trend value of 3rd qtr is 32 + 3/2 = 33.5
Trend value of 1st qtr is 30.5 - 3 = 27.5
Trend value of 4th qtr is 33.5 + 3 = 36.5
Trend Values
Year    1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
2004    27.5      30.5      33.5      36.5
2005    39.5      42.5      45.5      48.5
2006    51.5      54.5      57.5      60.5
2007    63.5      66.5      69.5      72.5
2008    75.5      78.5      81.5      84.5
Quarterly values as % of trend values
Year                1st Q                      2nd Q                      3rd Q     4th Q
2004                109.1 (30/27.5 × 100)      131.1 (40/30.5 × 100)      107.5     93.1
2005                 86.1                      122.4                      109.9     90.7
2006                 77.7                      106.4                       93.9     79.3
2007                 85.0                      114.3                       97.8     85.5
2008                106.0                      117.1                      105.5     97.0
Total               463.9                      591.3                      514.6    445.6
Average              92.78                     118.26                     102.92    89.12
S Index adjusted     92.0                      117.4                      102.1     88.4
Total of averages = 403.08
As this is more than 400, an adjustment is made: each average is multiplied by 400 / 403.08 = 0.992.
Equivalently, the adjusted seasonal index for the 1st quarter = 92.78 / 100.77 × 100, where 100.77 is the average of the four quarterly averages (403.08 / 4).
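The whole ratio-to-trend procedure can be sketched end to end in Python. This is a sketch under the same assumptions as the worked example: a straight-line trend fitted to yearly averages with X centred on the middle year, and the yearly trend value placed at mid-year. All names are illustrative.

```python
quarterly = {
    2004: [30, 40, 36, 34],
    2005: [34, 52, 50, 44],
    2006: [40, 58, 54, 48],
    2007: [54, 76, 68, 62],
    2008: [80, 92, 86, 82],
}
years = sorted(quarterly)
n = len(years)

# Step 1: straight-line trend on yearly averages, X = -2..+2
yearly_avg = [sum(quarterly[y]) / 4 for y in years]
xs = [i - n // 2 for i in range(n)]
a = sum(yearly_avg) / n
b = sum(x * y for x, y in zip(xs, yearly_avg)) / sum(x * x for x in xs)
step = b / 4                                   # quarterly increment

# Step 2: quarterly trend values and actual values as % of trend
ratios = []
for x, year in zip(xs, years):
    mid_year = a + b * x                       # trend at the middle of the year
    trend = [mid_year + (q - 1.5) * step for q in range(4)]
    ratios.append([100 * v / t for v, t in zip(quarterly[year], trend)])

# Step 3: average the ratios per quarter and rescale so the indices total 400
quarter_avg = [sum(col) / n for col in zip(*ratios)]
adjusted = [v * 400 / sum(quarter_avg) for v in quarter_avg]
print([round(v, 1) for v in adjusted])
```

Small differences from the table in the last decimal arise because the hand calculation rounds each percentage before averaging.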
Cyclical Variations
- Residual method
- Reference cycle analysis method
- Direct method
- Harmonic analysis method
Irregular Variations
$Y = T \times S \times C \times I$
$\dfrac{T \times S \times C \times I}{T \times S \times C} = I$
Assignment 1 ( 5 marks, individual)
What do you understand by Time Series Analysis? Write a short note on the measurement of cyclical variations and irregular variations. Illustrate with an example.
Submit to CR by 4 pm, 26th July, 2013.
Assignment (in groups)
Q1 = 3 marks
Q2 = 5 marks (round off till 3 dec places)
Working on Excel is optional
Q
Obtain the straight line trend equation for the given time series data. Calculate the trend values for all 10 years. Also, fit a straight line to the graph by free hand. Forecast the trend for the year 2001 by both methods.
Years                           1991   1992   1993   1994   1995   1996   1997   1998   1999   2000
Annual Power Used (in units)    15     20     21     25     28     26     30     32     40     38
Take deviations from 1995.5
Soln:
Trend value for 1991 = 16.16
Q2
For the data given below
- obtain 12 monthly centred moving averages
- construct a seasonal index using the ratio to trend method
- forecast the seasonal values for the year 1996
Round off till three decimal places.
Months   1991   1992   1993   1994   1995
Jan      5      7      10     8      9
Feb      7      8      9      7      10
Mar      9      10     9      11     12
Apr      6      8      9      11     9
May      6      8      10     11     10
Jun      5      7      9      10     11
Jul      6      8      9      11     12
Aug      8      10     12     11     13
Sep      9      11     13     10     11
Oct      7      9      11     13     12
Nov      8      10     12     14     12
Dec      9      11     13     14     13