Time Series
TIME-SERIES ANALYSIS
EXAM FOCUS
A time series is a set of observations of a random variable spaced evenly through time (e.g., quarterly sales revenue for a company over the past 60 quarters). For the exam, given a regression output, identifying violations such as heteroskedasticity, nonstationarity, serial correlation, etc., will be important, as well as being able to calculate a predicted value given a time-series model. Know why a log-linear model is sometimes used; understand the implications of seasonality and how to detect and correct it, as well as the root mean squared error (RMSE) criterion.
LOS 2.a: Calculate and evaluate the predicted trend value for a time series, modeled as either a linear trend or a log-linear trend, given the estimated trend coefficients.
A time series is a set of observations for a variable over successive periods of time (e.g., monthly stock market returns for the past 10 years). The series has a trend if a consistent pattern can be seen by plotting the data (i.e., the individual observations) on a graph. For example, a seasonal trend in sales data is easily detected by plotting the data and noting the significant jump in sales during the same month(s) each year.
Ordinary least squares (OLS) regression is used to estimate the coefficients in the trend line, which provides the following prediction equation:

ŷt = b0 + b1(t)

Don't let this model confuse you. It's very similar to the simple linear regression model we covered previously; only here, (t) takes on the value of the time period. For example, in period 2, the equation becomes:

ŷ2 = b0 + b1(2)
Calculate ŷ1 and ŷ2 for a trend model with an estimated trend coefficient of b1 = 3.0.
Answer:
When t = 1: ŷ1 = b0 + b1(1)
When t = 2: ŷ2 = b0 + b1(2)
Note that the difference between ŷ1 and ŷ2 is 3.0, the value of the trend coefficient b1.
Consider hypothetical time series data for manufacturing capacity utilization.
Applying the OLS methodology to fit the linear trend model to the data produces the results shown here.
Answer:
As shown in the regression output, the estimated intercept and slope parameters for our manufacturing capacity utilization model are 82.137 and −0.223, respectively. This means that the prediction equation for capacity utilization can be expressed as:

ŷt = 82.137 − 0.223(t)

With this equation, we can generate estimated values for capacity utilization, ŷt, for each of the 14 quarters in the time series. For example, using the model, capacity utilization for the first quarter of 2020 is estimated at 81.914:

ŷ1 = 82.137 − 0.223(1) = 81.914
Note that the estimated value of capacity utilization in that quarter (using the model) is not exactly the same as the actual, measured capacity utilization for that quarter (82.4). The difference between the two is the error or residual term associated with that observation:

ε̂1 = 82.4 − 81.914 = 0.486

Note that since the actual, measured value is greater than the predicted value of y for 2020.1, the error term is positive. Had the actual, measured value been less than the predicted value, the error term would have been negative.
The projections (i.e., values generated by the model) for all quarters are compared to the actual values here.
The following graph shows visually how the predicted values compare to the actual values, which were used to generate the regression equation. The residuals, or error terms, are represented by the distance between the predicted (straight) regression line and the actual data plotted in blue. For example, the residual for t = 10 is 81.9 − 79.907 = 1.993.
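The fitted values and residual above can be reproduced with a short sketch. The trend coefficients are not shown in the output excerpt, so the sketch back-solves them from the two fitted values quoted in the text (81.914 at t = 1 and 79.907 at t = 10); treat the resulting figures as approximations of the regression estimates.

```python
# Linear trend model sketch: y_hat(t) = b0 + b1 * t.
# Coefficients back-solved from the fitted values quoted in the text.
b1 = (79.907 - 81.914) / (10 - 1)   # slope, roughly -0.223
b0 = 81.914 - b1 * 1                # intercept, roughly 82.137

def trend_forecast(t):
    """Predicted capacity utilization for period t."""
    return b0 + b1 * t

# Residual = actual - predicted; actual value in period 1 was 82.4:
residual_1 = 82.4 - trend_forecast(1)   # roughly 0.486, a positive error
```

Because the actual value exceeds the fitted value in period 1, the residual comes out positive, matching the discussion above.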
A log-linear trend model starts from an exponential function of time:

yt = e^(b0 + b1(t) + εt)

This model defines y, the dependent variable, as an exponential function of time, the independent variable. Rather than try to fit the nonlinear data with a linear (straight-line) regression, we take the natural log of both sides of the equation and arrive at the log-linear model, which is frequently used when time series data exhibit exponential growth:

ln(yt) = b0 + b1(t) + εt

Now that the equation has been transformed from an exponential to a linear function, we can use a linear regression technique to model the series. The use of the transformed data produces a linear trend line with a better fit for the data and increases the predictive ability of the model.
An analyst estimates a log-linear trend model using quarterly revenue data (in millions of $) from the first quarter of 2012 to the fourth quarter of 2023 for JP Northfield, Inc. Forecast JP Northfield's revenue for the first quarter of 2024.
Answer:
In the first quarter of 2024, t is equal to 49 because the sample has 48 observations. Substituting t = 49 into the estimated equation produces a forecast for the natural log of revenue of 8.41.
The first answer you get in this calculation is the natural log of the revenue forecast. In order to turn the natural log into a revenue figure, you use the 2nd function of the LN key (e^x) on your BA II Plus: enter 8.41 and press [2nd] [e^x] = 4,492 million.
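Converting the log forecast back to a revenue level is a single exponentiation, mirroring the calculator keystrokes above:

```python
import math

# The model forecasts ln(revenue) = 8.41 for t = 49; the revenue
# forecast in levels is e^8.41.
ln_forecast = 8.41
revenue_forecast = math.exp(ln_forecast)   # roughly 4,492 ($ millions)
```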
LOS 2.b: Describe factors that determine whether a linear or a log-linear trend should be used with a particular time series and evaluate limitations of trend models.
The left panel is a plot of data that exhibits exponential growth along with a linear trend line. The panel on the right is a plot of the natural logs of the original data and a representative log-linear trend line. The log-linear model fits the transformed data better than the linear trend model and, therefore, yields more accurate forecasts.
The bottom line is that when a variable grows at a constant rate, a log-linear model is most appropriate. When the variable increases over time by a constant amount, a linear trend model is most appropriate.
Limitations of Trend Models
Recall that one of the assumptions underlying linear regression is that the residuals are uncorrelated with each other. A violation of this assumption is referred to as autocorrelation. For AR models (where the lagged dependent variable is an independent variable), the presence of serial correlation in the residuals indicates that the model is misspecified. This is a significant limitation, as it means that the model is not appropriate for the time series and that we should not use it to predict future values.
In the preceding discussion, we suggested that a log-linear trend model would be better than a linear trend model when the variable exhibits a constant growth rate. However, it may be the case that even a log-linear model is not appropriate in the presence of serial correlation. In this case, we will want to turn to an autoregressive model.
Recall from the previous topic review that the Durbin-Watson statistic (DW) is used to detect autocorrelation. For a time series model without serial correlation, DW should be approximately equal to 2.0. A DW significantly different from 2.0 suggests that the residual terms are correlated.
Consider the results of the regression of monthly real estate loans (RE) in billions of dollars by commercial
banks over the period January 2020 through September 2023 in the following table:
When the dependent variable is regressed against one or more lagged values of itself, the resultant model is called an autoregressive (AR) model. For example, the sales for a firm could be regressed against the sales for the firm in the previous month.
Consider the AR(1) model:

xt = b0 + b1xt-1 + εt

In an autoregressive time series, past values of a variable are used to predict the current (and hence future) value of the variable.
Statistical inferences based on ordinary least squares (OLS) estimates for an AR time series model may be invalid unless the time series being modeled is covariance stationary.
A time series is covariance stationary if it satisfies the following three conditions:
1. Constant and finite expected value. The expected value of the time series is constant over time. (Later, we will refer to this value as the mean-reverting level.)
2. Constant and finite variance. The time series' volatility around its mean (i.e., the distribution of the individual observations around the mean) does not change over time.
3. Constant and finite covariance between values at any given lag. The covariance of the time series with leading or lagged values of itself is constant.
LOS 2.d: Describe the structure of an autoregressive (AR) model of order p and calculate one- and two-period-ahead forecasts given the estimated coefficients.
The following model illustrates how variable x would be regressed on itself with a lag of one and two periods:

xt = b0 + b1xt-1 + b2xt-2 + εt

In general, an AR model of order p, AR(p), takes the form:

xt = b0 + b1xt-1 + b2xt-2 + … + bpxt-p + εt

where p indicates the number of lagged values that the AR model will include as independent variables.
Note that the ^ symbol above the variables in the equations indicates that the inputs used in multi-period forecasts are actually forecasts (estimates) themselves. This implies that multi-period forecasts are more uncertain than single-period forecasts. For example, for a two-step-ahead forecast, there is the usual uncertainty associated with forecasting xt+1 using xt, plus the additional uncertainty of forecasting xt+2 using the forecasted value for xt+1.
EXAMPLE: Forecasting
Suppose that an AR(1) model has been estimated and has produced the following prediction equation: xt = 1.2 + 0.45xt-1. Calculate a two-step-ahead forecast if the current value of x is 5.0.
Answer:
One-step-ahead forecast: x̂t+1 = 1.2 + 0.45(5.0) = 3.45.
Two-step-ahead forecast: x̂t+2 = 1.2 + 0.45(3.45) = 2.7525.
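The chain-rule arithmetic in this example can be sketched as a small loop that feeds each forecast back in as the next period's lagged value:

```python
def ar1_forecast(b0, b1, x_current, steps):
    """Chain-rule forecasting for an AR(1) model: each step feeds the
    previous forecast back in as the lagged value."""
    x = x_current
    for _ in range(steps):
        x = b0 + b1 * x
    return x

one_step = ar1_forecast(1.2, 0.45, 5.0, 1)   # 1.2 + 0.45(5.0)  = 3.45
two_step = ar1_forecast(1.2, 0.45, 5.0, 2)   # 1.2 + 0.45(3.45) = 2.7525
```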
LOS 2.e: Explain how autocorrelations of the residuals can be used to test whether the autoregressive model fits the time series.
PROFESSOR'S NOTE
The Durbin-Watson test that we used with trend models is not appropriate for testing for serial correlation of the error terms in an autoregressive model. Use this t-test instead.
The correlations of the error terms from the estimation of an AR(1) model using a sample with 102 observations are presented in the following figure. Determine whether the model is correctly specified.
Autocorrelation Analysis
Answer:
In this example, the standard error is 1/√102, or 0.099. The t-statistic for Lag 2 is then computed as 0.0843368 / 0.099 = 0.8518.
The critical two-tail t-value at the 5% significance level and 100 degrees of freedom is 1.98. The t-statistics indicate that none of the autocorrelations of the residuals in the previous figure is statistically different from zero because their absolute values are less than 1.98. Thus, there is sufficient reason to believe that the error terms from the AR(1) model are not serially correlated.
If the t-tests indicate that any of the correlations computed in Step 2 are statistically significant (i.e., |t| ≥ 1.98), the AR model is not specified correctly. Additional lags are included in the model and the correlations of the residuals (error terms) are checked again. This procedure is followed until all autocorrelations are insignificant.
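The standard error and t-statistic used in this test follow directly from the sample size; a sketch using the Lag 2 figures from the example above:

```python
import math

# t-test for a residual autocorrelation: the standard error is 1/sqrt(T),
# where T is the number of observations (102 in the example above).
T = 102
std_error = 1 / math.sqrt(T)            # roughly 0.099

def autocorr_t_stat(rho):
    """t-statistic for a residual autocorrelation rho."""
    return rho / std_error

t_lag2 = autocorr_t_stat(0.0843368)     # roughly 0.85
significant = abs(t_lag2) > 1.98        # False: no serial correlation detected
```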
A time series exhibits mean reversion if it has a tendency to move toward its mean. In other words, the time series has a tendency to decline when the current value is above the mean and rise when the current value is below the mean. If a time series is at its mean-reverting level, the model predicts that the next value of the time series will be the same as its current value. For an AR(1) model, this occurs when:

xt = b0 / (1 − b1)
EXAMPLE: Mean-reverting time series
Calculate the mean-reverting level for the manufacturing capacity utilization time series using the following regression results:
Answer:
Using the estimated coefficients, the mean-reverting level is b0 / (1 − b1) = 67.16.
This means that if the current level of manufacturing capacity utilization is above 67.16, it is expected to fall in the next period, and if manufacturing capacity utilization is below 67.16 in the current period, it is expected to rise in the next period.
All covariance stationary time series have a finite mean-reverting level. An AR(1) time series will have a finite mean-reverting level when the absolute value of the lag coefficient is less than 1 (i.e., |b1| < 1).
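The mean-reverting level calculation can be sketched as follows; the coefficients reused here are the ones from the earlier AR(1) forecasting example (b0 = 1.2, b1 = 0.45), not the capacity utilization estimates:

```python
def mean_reverting_level(b0, b1):
    """Mean-reverting level of an AR(1) model, b0 / (1 - b1).
    Only finite when |b1| < 1 (the covariance-stationary case)."""
    if abs(b1) >= 1:
        raise ValueError("no finite mean-reverting level when |b1| >= 1")
    return b0 / (1 - b1)

# Coefficients from the earlier AR(1) forecasting example:
level = mean_reverting_level(1.2, 0.45)   # 1.2 / 0.55, roughly 2.18
```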
LOS 2.g: Contrast in-sample and out-of-sample forecasts and compare the forecasting accuracy of different time-series models based on the root mean squared error criterion.
In-sample forecasts are made within the range of data (i.e., time period) used to estimate the model, which for a time series is known as the sample or test period. In-sample forecast errors are the residuals ε̂t = yt − ŷt, where t is an observation within the sample period. In other words, we are comparing how accurate our model is in forecasting the actual data we used to develop the model. The Predicted vs. Actual Capacity Utilization figure in our Trend Analysis example shows an example of values predicted by the model compared to the values used to generate the model.
Out-of-sample forecasts are made outside of the sample period. In other words, we compare how accurate a model is in forecasting the y variable value for a time period outside the period used to develop the model. Out-of-sample forecasts are important because they provide a test of whether the model adequately describes the time series and whether it has relevance (i.e., predictive power) in the real world. Nonetheless, an analyst should be aware that most published research employs in-sample forecasts only.
The root mean squared error (RMSE) criterion is used to compare the accuracy of autoregressive models in forecasting out-of-sample values. For example, a researcher may have two autoregressive (AR) models: an AR(1) model and an AR(2) model. To determine which model will more accurately forecast future values, we calculate the RMSE (the square root of the average of the squared errors) for the out-of-sample data. Note that the model with the lowest RMSE for in-sample data may not be the model with the lowest RMSE for out-of-sample data.
For example, imagine that we have 60 months of historical unemployment data. We estimate both models over the first 36 of the 60 months. To determine which model will produce better (i.e., more accurate) forecasts, we then forecast the values for the last 24 of the 60 months of historical data. Using the actual values for the last 24 months as well as the values predicted by the models, we can calculate the RMSE for each model.
The model with the lower RMSE for the out-of-sample data will have lower forecast error and will be expected to have better predictive power in the future.
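The RMSE comparison can be sketched as follows; the actual and forecast values below are hypothetical placeholders, not figures from the text:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared error."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical out-of-sample values and forecasts from two candidate models:
actual = [5.0, 5.2, 4.9, 5.1]
ar1_preds = [4.8, 5.3, 5.0, 5.0]
ar2_preds = [4.5, 5.6, 4.4, 5.5]

# The model with the lower out-of-sample RMSE is preferred.
better = "AR(1)" if rmse(actual, ar1_preds) < rmse(actual, ar2_preds) else "AR(2)"
```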
In addition to examining the RMSE criterion for a model, we will also want to examine the stability of the regression coefficients, which we discuss next.
Financial and economic time series inherently exhibit some form of instability or nonstationarity. This is because financial and economic conditions are dynamic, and the estimated regression coefficients in one period may be quite different from those estimated during another period.
Models estimated with shorter time series are usually more stable than those with longer time series because a longer sample period increases the chance that the underlying economic process has changed. Thus, there is a tradeoff between the increased statistical reliability when using longer time periods and the increased stability of the estimates when using shorter periods.
The primary concern when selecting a time series sample period is the underlying economic processes. Have there been regulatory changes? Has there been a dramatic change in the underlying economic environment?
If the answer is yes, then the historical data may not provide a reliable model. Merely examining the significance of the autocorrelation of the residuals will not indicate whether the model is valid. We must also examine whether the data is covariance stationary.
Random walk. If a time series follows a random walk process, the predicted value of the series (i.e., the value of the dependent variable) in one period is equal to the value of the series in the previous period plus a random error term.
A time series that follows a simple random walk process is described in equation form as xt = xt-1 + εt, where the best forecast of xt is xt-1 and:
1. E(εt) = 0: The expected value of each error term is zero.
2. E(εt²) = σ²: The variance of the error terms is constant.
3. E(εiεj) = 0 if i ≠ j: There is no serial correlation in the error terms.
Random walk with a drift. If a time series follows a random walk with a drift, the intercept term is not equal to zero. That is, in addition to a random error term, the time series is expected to increase or decrease by a constant amount each period. A random walk with a drift can be described as:

xt = b0 + xt-1 + εt, where b0 ≠ 0
Covariance stationarity. Neither a random walk nor a random walk with a drift exhibits covariance stationarity. To show this, let's start by expressing a random walk as an AR(1) model with b1 = 1:

xt = b0 + b1xt-1 + εt

In either case (with or without a drift), because b1 = 1, the mean-reverting level b0 / (1 − b1) = b0 / 0 is undefined (the division of any number by zero is undefined), and as we stated earlier, a time series must have a finite mean-reverting level to be covariance stationary. Thus, a random walk, with or without a drift, is not covariance stationary, and exhibits what is known as a unit root (b1 = 1). For a time series that is not covariance stationary, the least squares regression procedure that we have been using to estimate an AR(1) model will not work without transforming the data. We discuss unit roots and how they are handled in the next section.
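The undefined mean-reverting level of a random walk can be demonstrated directly: with b1 = 1, the formula b0 / (1 − b1) divides by zero.

```python
def mean_reverting_level(b0, b1):
    """Mean-reverting level of an AR(1) model, b0 / (1 - b1)."""
    return b0 / (1 - b1)

# A random walk is an AR(1) with b0 = 0 and b1 = 1 (a unit root):
has_finite_level = True
try:
    mean_reverting_level(0.0, 1.0)
except ZeroDivisionError:
    has_finite_level = False   # random walk: not covariance stationary
```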
LOS 2.j: Describe implications of unit roots for time-series analysis, explain when unit roots are likely to occur and how to test for them, and demonstrate how a time series with a unit root can be transformed so it can be analyzed with an AR model.
LOS 2.k: Describe the steps of the unit root test for nonstationarity and explain the relation of the test to autoregressive time-series models.
As we discussed in the previous LOS, if the value of the lag coefficient is equal to one, the time series is said to have a unit root and will follow a random walk process, and the series is not covariance stationary. Since a time series that follows a random walk is not covariance stationary, modeling such a time series in an AR model can lead to incorrect inferences.
Unit Root Testing for Nonstationarity
To determine whether a time series is covariance stationary, we can (1) run an AR model and examine autocorrelations, or (2) perform the Dickey-Fuller test.
In the first method, an AR model is estimated and the statistical significance of the autocorrelations at various lags is examined. A stationary process will usually have residual autocorrelations insignificantly different from zero at all lags or residual autocorrelations that decay to zero as the number of lags increases.
A more definitive test for a unit root is the Dickey-Fuller test. For statistical reasons, you cannot directly test whether the coefficient on the independent variable in an AR time series is equal to 1. To compensate, Dickey and Fuller created a rather ingenious test for a unit root. Remember, if an AR(1) model has a coefficient of 1, it has a unit root and no finite mean-reverting level (i.e., it is not covariance stationary). Dickey and Fuller (DF) transform the AR(1) model to run a simple regression. To transform the model, they (1) start with the basic form of the AR(1) model and (2) subtract xt-1 from both sides:

xt − xt-1 = b0 + (b1 − 1)xt-1 + εt

Then, rather than directly testing whether the original coefficient is different from 1, they test whether the new, transformed coefficient (b1 − 1) is different from zero using a modified t-test. If (b1 − 1) is not significantly different from zero, they say that b1 must be equal to 1.0 and, therefore, the series must have a unit root.
PROFESSOR'S NOTE
In their actual test, Dickey and Fuller use the variable g, which equals (b1 − 1). The null hypothesis is g = 0 (i.e., the time series has a unit root). For the exam, understand how the test is conducted and be able to interpret its results. For example, if on the exam you are told the null (g = 0) cannot be rejected, your answer is that the time series has a unit root. If the null is rejected, the time series does not have a unit root.
First Differencing
If we believe a time series is a random walk (i.e., has a unit root), we can transform the data to a covariance stationary time series using a procedure called first differencing. The first differencing process involves subtracting the value of the time series (i.e., the dependent variable) in the immediately preceding period from the current value of the time series to define a new dependent variable, y. Note that by taking first differences, you model the change in the value of the dependent variable.
So, if the original time series of x has a unit root, the change in x, xt − xt-1 = εt, is just the error term. This means we can define yt as:

yt = xt − xt-1 = εt

Stating y in the form of an AR(1) model, yt = b0 + b1yt-1 + εt, where b0 = b1 = 0, this transformed time series has a finite mean-reverting level of b0 / (1 − b1) = 0 / (1 − 0) = 0 and is, therefore, covariance stationary.
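First differencing is a one-line transformation; the series values below are hypothetical placeholders used only to illustrate the mechanics:

```python
def first_differences(series):
    """Transform a series x into y_t = x_t - x_(t-1)."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# For a pure random walk x_t = x_(t-1) + e_t, the differenced series is
# just the sequence of error terms, which is covariance stationary.
x = [100.0, 101.5, 101.0, 102.3, 102.1]
y = first_differences(x)   # approximately [1.5, -0.5, 1.3, -0.2]
```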
EXAMPLE: Unit root
Suppose we decide to model the capacity utilization data. Using an AR(1) model, the results indicate that the capacity utilization time series probably contains a unit root and is, therefore, not covariance stationary. Discuss how this time series can be transformed to be covariance stationary.
Answer:
Covariance stationarity can often be achieved by transforming the data using first differencing and modeling the first-differenced time series as an autoregressive time series.
The next figure contains the first differences of our manufacturing capacity utilization time series for the period 2020.1 through 2023.3. The first two columns contain the original time series. The first differences of the original series are contained in the third column of the table, and the one-period lagged values of the first differences are presented in the fourth column of the table. Note that the first differences in this example represent the change in manufacturing capacity from the preceding period and are designated as yt and yt-1.
After this transformation, it is appropriate to estimate the AR(1) model yt = b0 + b1yt-1 + εt. The regression results for the first-differenced time series are presented in the next figure, where it can be seen that the estimated coefficient on the lag variable is statistically significant at a 5% level of significance.
The results of the estimation of monthly revolving credit outstanding (RCO) on the one-period lagged values for
RCO from January 2020 through December 2022 are presented in the following table.
1. What type of time-series model was used to produce the regression results in the table? A(n):
A. AR model.
B. heteroskedasticity (H) model.
C. trend model with a drift.
2. An approach that may work in the case of modeling a time series that has a unit root is to:
A. use an ARCH model.
B. use a trend model.
C. model the first differences of the time series.
3. Which of the following will always have a finite mean-reverting level?
A. A covariance-stationary time series.
B. A random-walk-with-drift time series.
C. A time series with unit root.
4. Which of the following statements is most accurate? A random walk process:
A. is nonstationary.
B. has a finite mean-reverting level.
C. can be appropriately fit as an AR(1) model.
5. Which of the following is not correct about the Dickey-Fuller unit root test for nonstationarity?
A. The null hypothesis is that the time series has a unit root.
B. A hypothesis test is conducted using critical values computed by Dickey and Fuller in place of
conventional t-test values.
C. If the test statistic is significant, we conclude that the time series is nonstationary.
Seasonality in a time series is a pattern that tends to repeat from year to year. One example is monthly sales data for a retailer. Given that sales data normally vary according to the time of year, we might expect this month's sales (xt) to be related to sales for the same month last year (xt-12).
When seasonality is present, a model of the associated time series will be misspecified unless the AR model incorporates the effects of the seasonality.
You are interested in predicting occupancy levels for a resort hotel chain and have obtained the chain's quarterly occupancy levels for the most recent 40 quarters (10 years). You decide to model the quarterly occupancy time series using the AR(1) model:

ln xt = b0 + b1(ln xt-1) + εt

Determine whether seasonality exists using the results presented in the following figure.
Answer:
The bottom part of the table contains the residual autocorrelations for the first four lags of the time series. What stands out is the relatively large autocorrelation and t-statistic for the fourth lag. With 39 observations and two parameters (b0 and b1), there are 37 degrees of freedom. At a significance level of 5%, the critical t-value is 2.026.
The t-statistics indicate that none of the first three lagged autocorrelations is significantly different from zero. However, the t-statistic at Lag 4 is 5.4460, which means that we must reject the null hypothesis that the Lag 4 autocorrelation is zero and conclude that seasonality is present in the time series. Thus, we conclude that this model is misspecified and will be unreliable for forecasting purposes. We need to include a seasonality term to specify the model correctly.
PROFESSOR’S NOTE
The reason 40 quarters of data only produces 39 observations is that the model includes a one-quarter lag of the dependent variable: the first quarter in the sample has no prior-quarter value, so 40 data points yield 39 usable observations.
Correcting for seasonality. The interpretation of seasonality in the previous example is that occupancy in any quarter is related to occupancy in the previous quarter and the same quarter in the previous year. For example, fourth quarter 2022 occupancy is related to third quarter 2022 occupancy as well as fourth quarter 2021 occupancy.
To adjust for seasonality in an AR model, an additional lag of the dependent variable (corresponding to the same period in the previous year) is added to the original model as another independent variable. For example, if quarterly data are used, the seasonal lag is 4; if monthly data are used, the seasonal lag is 12; and so on.
We continue with our resort occupancy level example, where the significant residual correlation at Lag 4 indicates seasonality in the quarterly time series. By testing the correlations of the error terms, it appears that occupancy levels in each quarter are related not only to the previous quarter, but also to the corresponding quarter in the previous year. To adjust for this problem, we add a lagged value of the dependent variable to the original model that corresponds to the seasonal pattern.
To model the autocorrelation of the same quarters from year to year, we use an AR(1) model with a seasonal lag: ln xt = b0 + b1(ln xt-1) + b2(ln xt-4) + εt. Note that this specification, the inclusion of a seasonal lag, does not result in an AR(2) model. It results in an AR(1) model incorporating a seasonal lag term.
The results obtained when this model is fit to the natural logarithm of the time series are presented in the following. Determine whether the model is specified correctly.
Answer:
Notice in the bottom of the table that the fourth-lag residual autocorrelation has dropped substantially and is, in fact, no longer statistically significant. Also notable in these results is the improvement in the R-squared for the adjusted model (94.9%) compared to the R-squared from the original model (79.3%). The results shown in the figure indicate that, by incorporating a seasonal lag term, the model is now specified correctly.
Based on the regression results from the previous example and the occupancy levels over the past year (presented next), forecast the level of hotel occupancy for the first quarter of 2023.
Answer:
To forecast the occupancy level for the hotel chain for the first quarter of 2023 (i.e., 2023.1), the estimated model is applied to the two required lagged values (the previous quarter, 2022.4, and the same quarter in the previous year, 2022.1), producing a log forecast of 13.3103.
The forecasted level of hotel occupancy for the first quarter of 2023 is 603,379, a significant increase over the same quarter the previous year.
PROFESSOR'S NOTE
Once again, the first answer you get in this calculation is the natural log of the occupancy forecast. In order to turn the natural log into an occupancy figure, you use the 2nd function of the LN key (e^x) on your BA II Plus: enter 13.3103 and press [2nd] [e^x] = 603,378.52.
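The forecast from an AR(1) model with a seasonal lag can be sketched as follows; the coefficients are hypothetical placeholders, not the hotel-chain estimates from the omitted regression table:

```python
import math

# Sketch of a forecast from an AR(1) model with a seasonal lag:
#   ln(x_t) = b0 + b1 * ln(x_(t-1)) + b2 * ln(x_(t-4))
# The coefficients below are hypothetical placeholders.
b0, b1, b2 = 0.05, 0.60, 0.40

def seasonal_forecast(x_prev_quarter, x_same_quarter_last_year):
    """One-step forecast: needs last quarter's value and the value from
    the same quarter one year ago; exponentiates to return to levels."""
    ln_forecast = (b0
                   + b1 * math.log(x_prev_quarter)
                   + b2 * math.log(x_same_quarter_last_year))
    return math.exp(ln_forecast)
```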
The next figure contains the results from the regression of an ARCH(1) model:

ε̂t² = a0 + a1ε̂t-1² + μt

The squared errors for periods t through T are regressed on the squared errors for periods t − 1 through T − 1. (μt is the error term for the model.) Determine whether the results indicate autoregressive conditional heteroskedasticity (ARCH), and if so, calculate the predicted variance of the error terms in the next period if the current period squared error is 0.5625.
Answer:
Since the p-value for the coefficient on the lagged variable indicates statistical significance, we can conclude that the time series is ARCH(1). As such, the variance of the error term in the next period can be computed as:

σ̂t+1² = â0 + â1(0.5625)
PROFESSOR'S NOTE
If the coefficient a1 is zero, the variance is constant from period to period. If a1 is greater than (less than) zero, the variance increases (decreases) over time (i.e., the error terms exhibit heteroskedasticity).
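The ARCH(1) variance forecast is a single evaluation of the fitted equation; the coefficients below are hypothetical placeholders, since the example's estimated a0 and a1 are in the omitted regression table:

```python
# ARCH(1) variance forecast: sigma^2_(t+1) = a0 + a1 * (current squared error).
# Hypothetical placeholder coefficients:
a0, a1 = 0.01, 0.40

def arch1_variance_forecast(current_squared_error):
    """Predicted variance of next period's error term under ARCH(1)."""
    return a0 + a1 * current_squared_error

next_var = arch1_variance_forecast(0.5625)   # 0.01 + 0.40 * 0.5625
```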
LOS 2.n: Explain how time-series variables should be analyzed for nonstationarity and/or cointegration before use in a linear regression.
Occasionally an analyst will run a regression using two time series (i.e., time series utilizing two different variables). For example, using the market model to estimate the equity beta for a stock, an analyst regresses a time series of the stock's returns (yt) on a time series of returns for the market (xt):

yt = b0 + b1xt + εt

Notice that now we are faced with two different time series (yt and xt), either or both of which could be subject to nonstationarity.
To test whether the two time series have unit roots, the analyst first runs separate DF tests with five possible results:
1. Both time series are covariance stationary.
2. Only the dependent variable time series is covariance stationary.
3. Only the independent variable time series is covariance stationary.
4. Neither time series is covariance stationary and the two series are not cointegrated.
5. Neither time series is covariance stationary and the two series are cointegrated.
In Scenario 1 the analyst can use linear regression, and the coefficients should be statistically reliable, but regressions in Scenarios 2 and 3 will not be reliable. Whether linear regression can be used in Scenarios 4 and 5 depends upon whether the two time series are cointegrated.
Cointegration
Cointegration means that two time series are economically linked (related to the same macro variables) or follow the same trend and that relationship is not expected to change. If two time series are cointegrated, the error term from regressing one on the other is covariance stationary and the t-tests are reliable. This means that Scenario 5 will produce reliable regression estimates, whereas Scenario 4 will not.
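The five scenarios above reduce to a small decision rule, sketched here:

```python
def can_use_linear_regression(dep_stationary, indep_stationary,
                              cointegrated=False):
    """Decision rule from the five scenarios: regression is reliable when
    both series are covariance stationary (Scenario 1), or when neither
    is but the two series are cointegrated (Scenario 5)."""
    if dep_stationary and indep_stationary:
        return True                      # Scenario 1
    if dep_stationary != indep_stationary:
        return False                     # Scenarios 2 and 3
    return cointegrated                  # Scenario 4 (False) / 5 (True)
```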
To test whether two time series are cointegrated, we regress one variable on the other using the following model:

yt = b0 + b1xt + εt

The residuals are tested for a unit root using the Dickey-Fuller test with critical t-values calculated by Engle and Granger (i.e., the DF–EG test). If the test rejects the null hypothesis of a unit root, we say the error terms generated by the two time series are covariance stationary and the two series are cointegrated. If the two series are cointegrated, we can use the regression to model their relationship.
PROFESSOR'S NOTE
For the exam, remember that the Dickey-Fuller test does not use the standard critical t-values we typically use in testing the statistical significance of individual regression coefficients. The DF–EG test further adjusts them to test for cointegration. As with the DF test, you do not have to know critical t-values for the DF–EG test. Just remember that like the regular DF test, if the null is rejected, we say the series (of error terms in this case) is covariance stationary and the two time series are cointegrated.
Figure 2.2: Can Linear Regression Be Used to Model the Relationship Between Two Time Series?
LOS 2.o: Determine an appropriate time-series model to analyze a given investment problem and justify that choice.
To determine what type of model is best suited to meet your needs, follow these guidelines:
1. Determine your goal.
Are you attempting to model the relationship of a variable to other variables (e.g., cointegrated time series, cross-sectional multiple regression)?
Are you trying to model the variable over time (e.g., trend model)?
2. If you have decided on using a time series analysis for an individual variable, plot the values of the variable over time and look for characteristics that would indicate nonstationarity, such as nonconstant variance (heteroskedasticity), nonconstant mean, seasonality, or structural change.
A structural change is indicated by a significant shift in the plotted data at a point in time that seems to divide the data into two or more distinct patterns. (Figure 2.3 shows a data plot that indicates a structural shift in the time series at Point a.) In this example, you have to run two different models, one incorporating the data before and one after that date, and test whether the time series has actually shifted. If the time series has shifted significantly, a single time series encompassing the entire period (i.e., both patterns) will likely produce unreliable results.
KEY CONCEPTS
LOS 2.a
A time series is a set of observations for a variable over successive periods of time. A time series model
captures the time-series pattern and allows us to make predictions about the variable in the future.
LOS 2.b
A simple linear trend model is: yt = b0 + b1t + εt, estimated for t = 1, 2, …, T.
A log-linear trend model, ln(yt) = b0 + b1t + εt, is appropriate for exponential data.
A plot of the data should be used to determine whether a linear or log-linear trend model should be
used.
The primary limitation of trend models is that they are not useful if the residuals exhibit serial
correlation.
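To see why the choice matters, one can fit both models to exponentially growing data and compare the fit. A minimal numpy sketch using simulated (hypothetical) data:

```python
import numpy as np

# Hypothetical exponentially growing series: y_t = 100 * e^(0.05 t), plus noise
rng = np.random.default_rng(1)
t = np.arange(1, 41, dtype=float)
y = 100.0 * np.exp(0.05 * t) * np.exp(rng.normal(scale=0.02, size=t.size))

# Linear trend: y_t = b0 + b1*t
b1_lin, b0_lin = np.polyfit(t, y, 1)
sse_lin = np.sum((y - (b0_lin + b1_lin * t)) ** 2)

# Log-linear trend: ln(y_t) = b0 + b1*t (appropriate for constant growth rates)
b1_log, b0_log = np.polyfit(t, np.log(y), 1)
y_hat = np.exp(b0_log + b1_log * t)
sse_log = np.sum((y - y_hat) ** 2)

# For exponential data the log-linear model fits far better, and b1_log
# estimates the continuous growth rate (about 0.05 for this simulated series)
```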
LOS 2.c
A time series is covariance stationary if its mean, variance, and covariances with lagged and leading
values do not change over time. Covariance stationarity is a requirement for using AR models.
LOS 2.d
Autoregressive time-series multiperiod forecasts are calculated in the same manner as those for other
regression models, but because the independent variable is a lagged value of the dependent variable, it
is necessary to calculate a one-step-ahead forecast before a two-step-ahead forecast may be calculated.
The calculation of successive forecasts in this manner is referred to as the chain rule of forecasting.
A one-period-ahead forecast for an AR(1) would be determined in the following manner:
x̂t+1 = b̂0 + b̂1xt
A two-period-ahead forecast for an AR(1) would be determined in the following manner:
x̂t+2 = b̂0 + b̂1x̂t+1
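A short numeric illustration of the chain rule, using hypothetical AR(1) coefficients:

```python
# Chain rule of forecasting for an AR(1): the two-period-ahead forecast
# uses the one-period-ahead forecast, not observed data.
# Coefficients below are hypothetical.
b0, b1 = 1.5, 0.8
x_t = 10.0

x_1 = b0 + b1 * x_t   # one-period-ahead forecast: 1.5 + 0.8*10.0 = 9.5
x_2 = b0 + b1 * x_1   # two-period-ahead forecast: 1.5 + 0.8*9.5  = 9.1
```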
LOS 2.e
When an AR model is correctly specified, the residual terms will not exhibit serial correlation. If the
residuals possess some degree of serial correlation, the AR model that produced the residuals is not the
best model for the data being studied, and the regression results will be problematic. The procedure to
test whether an AR time-series model is correctly specified involves three steps:
1. Estimate the AR model being evaluated using linear regression.
2. Calculate the autocorrelations of the model’s residuals.
3. Test whether the autocorrelations are significant.
LOS 2.f
A time series is mean reverting if it tends toward its mean over time. The mean-reverting level of an
AR(1) model is b0 / (1 − b1).
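A quick numeric check with hypothetical AR(1) coefficients shows successive chain-rule forecasts converging to b0 / (1 − b1):

```python
b0, b1 = 1.5, 0.8                 # hypothetical AR(1) coefficients, |b1| < 1
mean_level = b0 / (1 - b1)        # mean-reverting level = 1.5 / 0.2 = 7.5

# Iterating the forecast x = b0 + b1*x converges to the mean-reverting level
x = 20.0                          # start well above the mean-reverting level
for _ in range(200):
    x = b0 + b1 * x
```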
LOS 2.g
In-sample forecasts are made within the range of data used in the estimation. Out-of-sample forecasts
are made outside of the time period for the data used in the estimation.
The root mean squared error (RMSE) criterion is used to compare the accuracy of autoregressive
models in forecasting out-of-sample values. A researcher may have two autoregressive (AR) models,
both of which seem to fit the data: an AR(1) model and an AR(2) model. To determine which model will
more accurately forecast future values, we calculate the square root of the mean squared error (RMSE).
The model with the lower RMSE for the out-of-sample data will have lower forecast error and will be
expected to have better predictive power in the future.
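The RMSE comparison can be sketched in a few lines of Python (the out-of-sample values and forecasts below are hypothetical):

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared error."""
    err = np.asarray(actual) - np.asarray(predicted)
    return np.sqrt(np.mean(err ** 2))

# Hypothetical out-of-sample values and forecasts from two competing AR models
actual = [10.0, 11.0, 12.0, 13.0]
ar1_forecast = [9.0, 11.5, 12.5, 12.0]
ar2_forecast = [10.5, 10.0, 12.0, 13.5]

# The model with the lower out-of-sample RMSE is preferred
rmse_ar1 = rmse(actual, ar1_forecast)   # sqrt(2.5/4)
rmse_ar2 = rmse(actual, ar2_forecast)   # sqrt(1.5/4) -- lower, so preferred
```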
LOS 2.h
Most economic and financial time series data are not stationary. The degree of the nonstationarity
depends on the length of the series and changes in the underlying economic environment.
LOS 2.i
A random walk time series is one for which the value in one period is equal to the value in the previous
period plus a random error (xt = xt−1 + εt). A random walk process does not have a mean-reverting
level and is not stationary.
LOS 2.j
A time series has a unit root if the coefficient on the lagged dependent variable is equal to one. A series
with a unit root is not covariance stationary. Economic and finance time series frequently have unit
roots. Data with a unit root must be first differenced before being used in a time series model.
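First differencing can be illustrated with numpy: a simulated random walk is nonstationary, but its first difference recovers the white-noise errors, which are covariance stationary.

```python
import numpy as np

# A random walk x_t = x_{t-1} + e_t has a unit root and is not stationary,
# but its first difference x_t - x_{t-1} = e_t is just the error term.
rng = np.random.default_rng(2)
eps = rng.normal(size=1000)
x = np.cumsum(eps)      # random walk built from white-noise errors
dx = np.diff(x)         # first differencing recovers the stationary errors

# dx matches eps[1:] (up to floating-point rounding), so the differenced
# series is covariance stationary and suitable for an AR model
```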
LOS 2.k
To determine whether a time series is covariance stationary, we can (1) run an AR model and examine
the autocorrelations of the residuals, and/or (2) perform the Dickey-Fuller test.
LOS 2.l
Seasonality in a time series is tested by calculating the autocorrelations of the error terms. A statistically
significant lagged error term corresponding to the periodicity of the data indicates seasonality.
Seasonality can be corrected by incorporating the appropriate seasonal lag term in an AR model.
If a seasonal lag coefficient is appropriate and corrects the seasonality, the AR model with the seasonal
terms will have no statistically significant autocorrelations of error terms.
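For quarterly data, the correction adds the fourth lag, e.g., xt = b0 + b1xt−1 + b2xt−4. A least-squares sketch on simulated data (the coefficient values are illustrative):

```python
import numpy as np

# Simulate a quarterly series with a seasonal (lag-4) component:
# x_t = 1.0 + 0.3*x_{t-1} + 0.5*x_{t-4} + e_t  (illustrative coefficients)
rng = np.random.default_rng(3)
n = 2000
x = np.zeros(n)
for t in range(4, n):
    x[t] = 1.0 + 0.3 * x[t - 1] + 0.5 * x[t - 4] + rng.normal(scale=0.1)

# Fit x_t on an intercept, x_{t-1}, and the seasonal lag x_{t-4} by OLS
X = np.column_stack([np.ones(n - 4), x[3:-1], x[:-4]])
coef, *_ = np.linalg.lstsq(X, x[4:], rcond=None)
b0_hat, b1_hat, b2_hat = coef
# b1_hat should be near 0.3 (lag 1) and b2_hat near 0.5 (seasonal lag 4);
# a significant b2_hat is the seasonal term doing its job
```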
LOS 2.m
ARCH is present if the variance of the residuals from an AR model is correlated across time. ARCH is
detected by estimating ε̂²t = a0 + a1ε̂²t−1 + μt. If a1 is statistically significant, ARCH exists and the
variance of the errors in period t + 1 can be predicted using: σ̂²t+1 = â0 + â1ε̂²t.
LOS 2.n
When working with two time series in a regression: (1) if neither time series has a unit root, then the
regression can be used; (2) if only one series has a unit root, the regression results will be invalid; (3) if
both time series have a unit root and are cointegrated, then the regression can be used; (4) if both time
series have a unit root but are not cointegrated, the regression results will be invalid.
The Dickey-Fuller test with critical t-values calculated by Engle and Granger is used to determine
whether two time series are cointegrated.
LOS 2.o
The RMSE criterion is used to determine which forecasting model will produce the most accurate
forecasts. The RMSE equals the square root of the average squared error.
Both the intercept term and the slope coefficient are significantly different from zero at the 5%
level because both t-statistics are greater than the critical t-value of 2.02. (LOS 2.a)
4. C
(LOS 2.a)
5. B A log-linear model (choice B) is most appropriate for a time series that grows at a relatively
constant growth rate. Neither a linear trend model (choice A) nor an AR(1) model (choice C) is
appropriate in this case. (LOS 2.b)