Assignment 2
In this assignment we analyze the GDP per capita of Norway, one of the happiest countries in the
world. The analysis covers the 15 years from 2005 to 2019 and uses GDP per capita (constant 2010
US$)1. Our objective is to estimate the constant annual growth rate of GDP per capita with a
regression model, test the significance of the parameters, use graphs to understand the trend in
GDP per capita and in the regression residuals, and finally predict the value of GDP per capita for
the year 2020. With this assignment we wish to apply and polish the Econometrics methods studied
in theory so far.
Fig-1: Trend graph of Y (left axis) and lnY (right axis) against Time
As we can observe from Fig-1, Y and lnY follow the same pattern over the 15 years under
consideration. This happens because the log transformation is monotonic (order-preserving), and
over the narrow range of Y observed here it is close to linear, so the pattern remains unchanged.
Note also that both Y and lnY show a roughly linear time trend from period 7 onwards, but not
before that.
1 Data collected from the World Bank's World Development Indicators:
https://data.worldbank.org/country/norway
c) To fit a constant growth curve and estimate the annual growth rate, we consider the 15 years
2005-2019. To carry out the regression, we recode the year as a time trend variable t that takes
values from 1 to 15, as shown in the table below.
Table-1
GDP per capita in Norway, 2005-2019
Year Y t lnY tlnY
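Using the data in the table, we can calculate the values of $\hat{\alpha}$ and $\hat{\beta}$:
$$\hat{\beta} = \frac{\sum t \ln y - n\,\bar{t}\,\overline{\ln y}}{\sum t^2 - n\,\bar{t}^2} = \frac{0.547499}{280} = 0.001955 \tag{c.1}$$
$$\hat{\alpha} = \overline{\ln y} - \hat{\beta}\,\bar{t} = 11.38927416$$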
Hence, we can conclude that 𝛽̂ and ĝ both estimate the annual growth rate, which is
approximately 0.2% per annum (𝛽̂ = 0.001955).
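The calculation in (c.1) can be reproduced with a few lines of Python. This is a minimal sketch: the lnY values are copied from Table-2, and the conversion from 𝛽̂ to ĝ assumes the growth model has the form Y_t = Y_0 (1 + g)^t, so that g = exp(𝛽) − 1. That form is an assumption, since equation (a.1) is not reproduced in this section.

```python
import math

# lnY for t = 1..15 (years 2005-2019), copied from Table-2.
ln_y = [11.38999619, 11.40565674, 11.42481239, 11.41710155, 11.3870692,
        11.38160637, 11.37840132, 11.39193877, 11.39013498, 11.39836161,
        11.40789107, 11.40973996, 11.42463004, 11.43084218, 11.43557262]
n = len(ln_y)
t = list(range(1, n + 1))

t_bar = sum(t) / n
ln_y_bar = sum(ln_y) / n

# Slope as in (c.1): (sum t*lnY - n*t_bar*lnY_bar) / (sum t^2 - n*t_bar^2).
numerator = sum(ti * yi for ti, yi in zip(t, ln_y)) - n * t_bar * ln_y_bar
denominator = sum(ti ** 2 for ti in t) - n * t_bar ** 2
beta_hat = numerator / denominator

# Intercept: lnY_bar - beta_hat * t_bar.
alpha_hat = ln_y_bar - beta_hat * t_bar

# Assumed link between slope and growth rate: g = exp(beta) - 1.
g_hat = math.exp(beta_hat) - 1

print(f"beta_hat  = {beta_hat:.9f}")
print(f"alpha_hat = {alpha_hat:.8f}")
print(f"g_hat     = {g_hat:.6%}")
```

Because 𝛽̂ is small, exp(𝛽̂) − 1 and 𝛽̂ agree to several decimal places, which is why both can be read as the annual growth rate.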
d) Table-2
Observed and Fitted dependent variable and estimated error terms
t lnY t·lnY t² Fitted lnY ν tν
1 11.38999619 11.38999619 1 11.39122952 -0.001233331 -0.001233331
2 11.40565674 22.81131349 4 11.39318487 0.012471871 0.024943742
3 11.42481239 34.27443717 9 11.39514023 0.029672165 0.089016494
4 11.41710155 45.6684062 16 11.39709558 0.020005969 0.080023876
5 11.3870692 56.93534602 25 11.39905094 -0.011981731 -0.059908655
6 11.38160637 68.2896382 36 11.40100629 -0.019399923 -0.116399537
7 11.37840132 79.64880922 49 11.40296164 -0.024560328 -0.171922293
8 11.39193877 91.13551014 64 11.404917 -0.012978231 -0.103825848
9 11.39013498 102.5112148 81 11.40687235 -0.016737372 -0.150636347
10 11.39836161 113.9836161 100 11.40882771 -0.010466096 -0.104660958
11 11.40789107 125.4868018 121 11.41078306 -0.002891993 -0.031811918
12 11.40973996 136.9168795 144 11.41273842 -0.00299846 -0.035981526
13 11.42463004 148.5201905 169 11.41469377 0.009936269 0.1291715
14 11.43084218 160.0317905 196 11.41664913 0.01419305 0.198702694
15 11.43557262 171.5335893 225 11.41860448 0.01696814 0.254522106
sum 120 171.073755 1369.137539 1240 171.073755 3.73035E-14 3.01981E-13
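The fitted values and residuals in Table-2 can be checked with the following sketch. It refits the line from the lnY column of Table-2 and then verifies the two normal equations, ∑ν = 0 and ∑tν = 0. The variable names are illustrative and not part of the original analysis.

```python
# lnY for t = 1..15, copied from Table-2.
ln_y = [11.38999619, 11.40565674, 11.42481239, 11.41710155, 11.3870692,
        11.38160637, 11.37840132, 11.39193877, 11.39013498, 11.39836161,
        11.40789107, 11.40973996, 11.42463004, 11.43084218, 11.43557262]
n = len(ln_y)
t = list(range(1, n + 1))

t_bar = sum(t) / n
ln_y_bar = sum(ln_y) / n

# OLS slope and intercept in deviation form.
s_tt = sum((ti - t_bar) ** 2 for ti in t)
beta_hat = sum((ti - t_bar) * (yi - ln_y_bar) for ti, yi in zip(t, ln_y)) / s_tt
alpha_hat = ln_y_bar - beta_hat * t_bar

# Fitted lnY and residuals nu = lnY - fitted lnY.
fitted = [alpha_hat + beta_hat * ti for ti in t]
residuals = [yi - fi for yi, fi in zip(ln_y, fitted)]

# Normal equations: both sums should be zero up to rounding error.
sum_nu = sum(residuals)
sum_t_nu = sum(ti * ri for ti, ri in zip(t, residuals))

for ti, yi, fi, ri in zip(t, ln_y, fitted, residuals):
    print(f"{ti:2d}  {yi:.8f}  {fi:.8f}  {ri: .9f}")
print(f"sum nu = {sum_nu:.3e}, sum t*nu = {sum_t_nu:.3e}")
```

The two sums come out at the level of floating-point rounding error, which is the same behaviour as the values of order 1E-13 in the last row of Table-2.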
Table-3
Formula value
As given in Table-3, we can see
e)
Fig-2: lnY and Fitted Y (the fitted values of lnY) plotted against Time
The correlation between the fitted regression line and the data is positive but weak, as was also
suggested by Fig-1 and by the low value of R², which is caused by the dip in lnY around periods 5 to 7.
f) Table-4
                    value         variance formula             variance      standard error
intercept           11.38927416   σ̂² ∑t² / (n ∑(t − t̄)²)      8.45841E-05   0.009196964
slope coefficient   0.001955355   σ̂² / ∑(t − t̄)²              1.0232E-06    0.001011531
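The variances and standard errors in Table-4 can be reproduced as sketched below. The sketch assumes that the error variance is estimated by the residual sum of squares divided by n − 2, the usual unbiased estimator for a two-parameter regression. The variable names are illustrative.

```python
import math

# lnY for t = 1..15, copied from Table-2.
ln_y = [11.38999619, 11.40565674, 11.42481239, 11.41710155, 11.3870692,
        11.38160637, 11.37840132, 11.39193877, 11.39013498, 11.39836161,
        11.40789107, 11.40973996, 11.42463004, 11.43084218, 11.43557262]
n = len(ln_y)
t = list(range(1, n + 1))

t_bar = sum(t) / n
ln_y_bar = sum(ln_y) / n
s_tt = sum((ti - t_bar) ** 2 for ti in t)  # sum of (t - t_bar)^2
beta_hat = sum((ti - t_bar) * (yi - ln_y_bar) for ti, yi in zip(t, ln_y)) / s_tt
alpha_hat = ln_y_bar - beta_hat * t_bar

# Estimated error variance: residual sum of squares over n - 2.
residuals = [yi - (alpha_hat + beta_hat * ti) for ti, yi in zip(t, ln_y)]
sigma2_hat = sum(r ** 2 for r in residuals) / (n - 2)

# Variance formulas from Table-4.
var_alpha = sigma2_hat * sum(ti ** 2 for ti in t) / (n * s_tt)
var_beta = sigma2_hat / s_tt

se_alpha = math.sqrt(var_alpha)
se_beta = math.sqrt(var_beta)

print(f"var(alpha_hat) = {var_alpha:.6e}, se = {se_alpha:.9f}")
print(f"var(beta_hat)  = {var_beta:.6e}, se = {se_beta:.9f}")
```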
Now, to test the significance of the intercept and the slope coefficient, we first construct the null
hypothesis and find the corresponding t-statistic. Starting with the intercept,
H0 : α = 0
HA : α ≠ 0
Now,
$$t = \frac{\hat{\alpha} - \alpha}{se(\hat{\alpha})} = \frac{11.38927416}{0.009196964} \approx 1238.37 \tag{f.1}$$
With n − 2 = 13 degrees of freedom, the two-tailed t-tabulated value is 2.160368656 at the 5% level of
significance and approximately 1.7709 at the 10% level. The t-calculated value lies in the critical
region at both levels, so we reject the null hypothesis: the intercept is significant at both the 5%
and 10% levels of significance.
Repeating similar steps to analyse the significance of the slope coefficient,
H0 : β = 0
HA : β ≠ 0
Now,
$$t = \frac{\hat{\beta} - \beta}{se(\hat{\beta})} = \frac{0.001955355}{0.001011531} = 1.933064011 \tag{f.2}$$
Comparing this with the two-tailed t-tabulated values for 13 degrees of freedom (2.160368656 at the
5% level and approximately 1.7709 at the 10% level), we can see that t-calculated does not lie in the
critical region at the 5% level but does at the 10% level. We therefore do not reject the null
hypothesis at the 5% level of significance, where the slope coefficient is not significantly different
from zero, but we do reject it at the 10% level.
With the t-statistics of both the intercept and the slope coefficient known, we can write the
regression equation as,
$$\widehat{\ln Y}_t = 11.38927416 + 0.001955355\,t \tag{f.3}$$
With this information we can write the 95% confidence interval for intercept α as,
Now using the t-calculated above in equations (f.1) and (f.2), we can find the p-value for intercept
and slope coefficients as,
Table-5
p-value
Intercept 2.34511E-34
Slope Coefficient 0.075307356
Comparing each p-value with the level of significance (5% in this case), we can conclude the
following. The p-value of the intercept is less than the level of significance, so we reject the null
hypothesis and conclude that the intercept is significantly different from zero. On the other hand,
the p-value of the slope coefficient is greater than the level of significance (although it is below
10%), so we do not reject the null hypothesis, and the slope coefficient is not significantly
different from zero at the 5% level.
Thus, the residual sums ∑ν and ∑tν in Table-2 are approximately equal to zero, satisfying the normal equations.
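The p-values in Table-5 can be approximated without a statistics library, as sketched below. The estimates and standard errors are taken from Table-4. The helper names `t_pdf` and `two_sided_p` are hypothetical, and the tail area of the Student t density is obtained by Simpson's rule after the substitution x = |t|/u, which is our choice of numerical method and not necessarily how the values in Table-5 were produced.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 2 * integral of the density from |t| to infinity.

    With x = |t| / u the tail becomes an integral over u in (0, 1], whose
    integrand tends to 0 as u -> 0 when df > 1. steps must be even.
    """
    a = abs(t_stat)
    if a == 0.0:
        return 1.0

    def g(u):
        if u == 0.0:
            return 0.0
        return t_pdf(a / u, df) * a / (u * u)

    h = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return 2 * total * h / 3

# Estimates and standard errors from Table-4; df = n - 2 = 13.
alpha_hat, se_alpha = 11.38927416, 0.009196964
beta_hat, se_beta = 0.001955355, 0.001011531
df = 13

t_alpha = alpha_hat / se_alpha
t_beta = beta_hat / se_beta
p_alpha = two_sided_p(t_alpha, df)
p_beta = two_sided_p(t_beta, df)

print(f"intercept: t = {t_alpha:.4f}, p = {p_alpha:.5e}")
print(f"slope:     t = {t_beta:.6f}, p = {p_beta:.6f}")
```

Integrating the tail directly, and not computing one minus the central area, keeps the very small p-value of the intercept from being lost to floating-point cancellation.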
Fig-3: Error Trend (Error plotted against Time)
The graph of the error/residual terms against time shows a trend similar to that observed in
Fig-1 for lnY against time. A likely reason is that some variables which affect GDP have not
been included in the regression as regressors, which is also consistent with the low value of
R². The residuals are negative for a long run of consecutive periods (5 to 12) and positive on
either side of it, which indicates that the error terms are correlated with one another.
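One simple way to put a number on this correlation is the lag-1 sample autocorrelation of the residuals, sketched below with the ν column of Table-2. This statistic is our own illustration and is not part of the original analysis. A value near zero would suggest uncorrelated errors, while a clearly positive value supports the pattern seen in Fig-3.

```python
# Residuals nu for t = 1..15, copied from Table-2.
residuals = [-0.001233331, 0.012471871, 0.029672165, 0.020005969, -0.011981731,
             -0.019399923, -0.024560328, -0.012978231, -0.016737372, -0.010466096,
             -0.002891993, -0.00299846, 0.009936269, 0.01419305, 0.01696814]

# Lag-1 autocorrelation: sum of nu_t * nu_(t-1) over sum of nu_t^2.
# The residuals already sum to (approximately) zero, so no mean is subtracted.
cross_products = sum(cur * prev for cur, prev in zip(residuals[1:], residuals[:-1]))
sum_of_squares = sum(r * r for r in residuals)
r1 = cross_products / sum_of_squares

print(f"lag-1 autocorrelation of residuals = {r1:.4f}")
```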
To predict the value of lnY for the year 2020, we can use equation (f.3). Putting t = 16, we get lnY16 =
11.42055983, which means that the econometric model predicts a GDP per capita for the year 2020 (Y16)
of US$91,177.17.
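The prediction step is a direct substitution followed by an exponential, as the short sketch below shows. The coefficients are the estimates reported in Table-4, and the variable names are illustrative.

```python
import math

# Estimated intercept and slope of the log-linear trend (Table-4).
alpha_hat = 11.38927416
beta_hat = 0.001955355

# The year 2020 corresponds to t = 16 because 2005 is coded as t = 1.
t_2020 = 16
ln_y_2020 = alpha_hat + beta_hat * t_2020

# Convert the predicted log value back to constant 2010 US$.
y_2020 = math.exp(ln_y_2020)

print(f"predicted lnY for 2020 = {ln_y_2020:.8f}")
print(f"predicted Y for 2020   = US${y_2020:,.2f}")
```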
Now using our Mathematical model in equation (a.1), we can put the growth rate we found in
equation (c.3) and t = 16. We get the value of Y16 as,
Here we can clearly see that the values predicted for the year 2020 by the econometric and
mathematical models differ, with the mathematical model predicting a higher GDP per capita for
that year.
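Now, to find the 95% confidence interval for lnY given t = 16, we make use of the formula
$$\hat{y} \pm t_{\alpha/2}\, se(\nu) \sqrt{\left[\frac{1}{n} + \frac{(t_p - \bar{t})^2}{\sum (t - \bar{t})^2}\right]}$$
where $\hat{y} = \hat{\alpha} + \hat{\beta}\, t_p$ and $se(\nu)$ is the standard error of the regression.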
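Using the two-tailed 5% critical value for 13 degrees of freedom, 2.160368656, this gives a confidence
interval for lnY from 11.40069 to 11.44043. These bounds are in log units, not in US$.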
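The interval can be computed as sketched below. The sketch refits the line from the lnY column of Table-2 and takes 2.160368656 as the two-tailed 5% critical value of t with 13 degrees of freedom. That critical value is hard-coded here as an assumption and is not derived by the code, and the variable names are illustrative.

```python
import math

# lnY for t = 1..15, copied from Table-2.
ln_y = [11.38999619, 11.40565674, 11.42481239, 11.41710155, 11.3870692,
        11.38160637, 11.37840132, 11.39193877, 11.39013498, 11.39836161,
        11.40789107, 11.40973996, 11.42463004, 11.43084218, 11.43557262]
n = len(ln_y)
t = list(range(1, n + 1))

t_bar = sum(t) / n
ln_y_bar = sum(ln_y) / n
s_tt = sum((ti - t_bar) ** 2 for ti in t)
beta_hat = sum((ti - t_bar) * (yi - ln_y_bar) for ti, yi in zip(t, ln_y)) / s_tt
alpha_hat = ln_y_bar - beta_hat * t_bar

# Standard error of the regression: sqrt(RSS / (n - 2)).
residuals = [yi - (alpha_hat + beta_hat * ti) for ti, yi in zip(t, ln_y)]
se_nu = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

t_p = 16              # prediction period (year 2020)
t_crit = 2.160368656  # assumed: two-tailed 5% critical value, 13 df

y_hat = alpha_hat + beta_hat * t_p
half_width = t_crit * se_nu * math.sqrt(1 / n + (t_p - t_bar) ** 2 / s_tt)
lower, upper = y_hat - half_width, y_hat + half_width

print(f"95% CI for lnY at t = 16: [{lower:.5f}, {upper:.5f}]")
print(f"in constant 2010 US$:     [{math.exp(lower):,.2f}, {math.exp(upper):,.2f}]")
```

Exponentiating the two bounds gives the corresponding interval in constant 2010 US$.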
From the regression analysis carried out on the per capita GDP of Norway, we can understand that
some important regressors have not been included in our model. As a result, the model has some
serious flaws: the log of GDP per capita is not linearly related to time (although the chosen model
is itself linear), the slope coefficient is not statistically significant at the 5% level, and the
error terms are correlated. A low value of R² is not indicative of a bad model per se, but in our
example, given all the other problems, we can say that the low R² does indicate a faulty model.
Irrespective of these flaws, this assignment gave us hands-on experience in the use of
Econometric concepts.