
READING 2

TIME-SERIES ANALYSIS

EXAM FOCUS
A time series is a set of observations of a random variable spaced evenly through time (e.g., quarterly
sales revenue for a company over the past 60 quarters). For the exam, given a regression output,
identifying violations such as heteroskedasticity, nonstationarity, serial correlation, etc., will be
important, as well as being able to calculate a predicted value given a time-series model. Know why a
log-linear model is sometimes used; understand the implications of seasonality and how to detect and
correct it, as well as the root mean squared error (RMSE) criterion.

MODULE 2.1: LINEAR AND LOG-LINEAR TREND MODELS

LOS 2.a: Calculate and evaluate the predicted trend value for a time series, modeled as
either a linear trend or a log-linear trend, given the estimated trend coefficients.

A time series is a set of observations for a variable over successive periods of time (e.g., monthly stock
market returns for the past 10 years). The series has a trend if a consistent pattern can be seen by
plotting the data (i.e., the individual observations) on a graph. For example, a seasonal trend in sales
data is easily detected by plotting the data and noting the significant jump in sales during the same
month(s) each year.

Linear Trend Model


A linear trend is a time series pattern that can be graphed using a straight line. A downward-sloping line
indicates a negative trend, while an upward-sloping line indicates a positive trend.
The simplest form of a linear trend is represented by the following linear trend model:

yt = b0 + b1(t) + εt, t = 1, 2, …, T

Ordinary least squares (OLS) regression is used to estimate the coefficients in the trend line, which
provides the following prediction equation:

ŷt = b̂0 + b̂1(t)

Don't let this model confuse you. It's very similar to the simple linear regression model we covered
previously; only here, (t) takes on the value of the time period. For example, in period 2, the equation
becomes:

ŷ2 = b̂0 + b̂1(2)

And, likewise, in period 3:

ŷ3 = b̂0 + b̂1(3)

This means ŷ increases by the value of b̂1 each period.

EXAMPLE: Using a linear trend model

Suppose you are given a linear trend model with a trend coefficient b̂1 = 3.0.

Calculate the change in the predicted value from one period to the next.

Answer:

In period t: ŷt = b̂0 + 3.0(t)

In period t + 1: ŷt+1 = b̂0 + 3.0(t + 1)

Note that the difference between ŷt+1 and ŷt is 3.0, or the value of the trend coefficient b1.
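The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration: only the trend coefficient b1 = 3.0 comes from the example, while the intercept b0 = 10.0 is a hypothetical value chosen for the demonstration.

```python
def linear_trend_forecast(b0, b1, t):
    """Predicted value of a linear trend model: y_hat = b0 + b1 * t."""
    return b0 + b1 * t

# b1 = 3.0 is the example's trend coefficient; b0 = 10.0 is hypothetical.
b0, b1 = 10.0, 3.0
y6 = linear_trend_forecast(b0, b1, 6)   # 10.0 + 3.0 * 6 = 28.0
y7 = linear_trend_forecast(b0, b1, 7)   # 10.0 + 3.0 * 7 = 31.0

# Consecutive predicted values differ by exactly the trend coefficient b1.
assert abs((y7 - y6) - b1) < 1e-12
```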

EXAMPLE: Trend analysis

Consider hypothetical time series data for manufacturing capacity utilization.

Manufacturing Capacity Utilization

Applying the OLS methodology to fit the linear trend model to the data produces the results shown
here.

Time Series Regression Results for Manufacturing Capacity Utilization

Based on this information, predict the projected capacity utilization for the time period involved in
the study (i.e., in-sample estimates).

Answer:

As shown in the regression output, the estimated intercept and slope parameters for our
manufacturing capacity utilization model are b̂0 = 82.137 and b̂1 = −0.223, respectively. This means that
the prediction equation for capacity utilization can be expressed as:

ŷt = 82.137 − 0.223(t)

With this equation, we can generate estimated values for capacity utilization, ŷt, for each of the 14
quarters in the time series. For example, using the model, capacity utilization for the first quarter of
2020 is estimated at 81.914:

ŷ1 = 82.137 − 0.223(1) = 81.914

Note that the estimated value of capacity utilization in that quarter (using the model) is not exactly
the same as the actual, measured capacity utilization for that quarter (82.4). The difference between
the two is the error, or residual, term associated with that observation:

ε̂1 = 82.4 − 81.914 = 0.486

Note that since the actual, measured value is greater than the predicted value of y for 2020.1, the
error term is positive. Had the actual, measured value been less than the predicted value, the error
term would have been negative.

The projections (i.e., ŷ values generated by the model) for all quarters are compared to the actual
values here.

Projected Versus Actual Capacity Utilization

The following graph shows visually how the predicted values compare to the actual values, which
were used to generate the regression equation. The residuals, or error terms, are represented by the
distance between the predicted (straight) regression line and the actual data plotted in blue. For
example, the residual for t = 10 is 81.9 − 79.907 = 1.993.

Predicted vs. Actual Capacity Utilization

Since we utilized a linear regression model, the predicted values will by definition fall on a straight
line. Since the raw data does not display a linear relationship, the model will probably not do a good
job of predicting future values.
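The fitted values and residuals quoted in this example can be reproduced with a short Python sketch (the coefficients b0 = 82.137 and b1 = −0.223 are the ones implied by the fitted values quoted in the text):

```python
def fitted(t, b0=82.137, b1=-0.223):
    """In-sample fitted value of the capacity utilization trend model."""
    return b0 + b1 * t

# Quarter 2020.1 (t = 1): predicted 81.914 vs. actual 82.4
resid_q1 = 82.4 - fitted(1)
# Quarter t = 10: predicted 79.907 vs. actual 81.9
resid_q10 = 81.9 - fitted(10)

assert round(fitted(1), 3) == 81.914
assert round(resid_q1, 3) == 0.486     # positive: actual above the line
assert round(fitted(10), 3) == 79.907
assert round(resid_q10, 3) == 1.993
```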

Log-Linear Trend Models


Time series data, particularly financial time series, often display exponential growth (growth with
continuous compounding). Positive exponential growth means that the random variable (i.e., the time
series) tends to increase at some constant rate of growth. If we plot the data, the observations will form
a convex curve. Negative exponential growth means that the data tends to decrease at some constant
rate of decay, and the plotted time series will be a concave curve.
When a series exhibits exponential growth, it can be modeled as:

yt = e^(b0 + b1(t))

This model defines y, the dependent variable, as an exponential function of time, the independent
variable. Rather than try to fit the nonlinear data with a linear (straight-line) regression, we take the
natural log of both sides of the equation and arrive at the log-linear model:

ln yt = b0 + b1(t) + εt

This model is frequently used when time series data exhibit exponential growth.

Now that the equation has been transformed from an exponential to a linear function, we can use a
linear regression technique to model the series. The use of the transformed data produces a linear
trend line with a better fit for the data and increases the predictive ability of the model.

EXAMPLE: Log-linear trend model

An analyst estimates a log-linear trend model using quarterly revenue data (in millions of $) from the
first quarter of 2012 to the fourth quarter of 2023 for JP Northfield, Inc.:

ln(revenuest) = b0 + b1(t) + εt

The results are shown in the following table.


Calculate JP Northfield's predicted revenues in the first quarter of 2024.

Answer:

In the first quarter of 2024, t is equal to 49 because the sample has 48 observations. Substituting
t = 49 into the estimated equation gives a natural log of the revenue forecast of 8.41.

To turn the natural log into a revenue figure, use the 2nd function of the LN key (e^x) on your BA II
Plus: enter 8.41 and press [2nd] [e^x] to get e^8.41 ≈ 4,492, or $4,492 million.
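The final step is just exponentiation; a one-line Python sketch of the same calculation the calculator performs:

```python
import math

ln_forecast = 8.41               # natural log of the revenue forecast
revenue = math.exp(ln_forecast)  # undo the log: e^8.41

# approximately $4,492 million, matching the calculator result
assert round(revenue) == 4492
```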

LOS 2.b: Describe factors that determine whether a linear or a log-linear trend should
be used with a particular time series and evaluate limitations of trend models.

Factors that Determine Which Model is Best


To determine if a linear or log-linear trend model should be used, the analyst should plot the data. A
linear trend model may be appropriate if the data points appear to be equally distributed above and
below the regression line. Inflation rate data can often be modeled with a linear trend model.
If, on the other hand, the data plots with a nonlinear (curved) shape, then the residuals from a linear
trend model will be persistently positive or negative for a period of time. In this case, the log-linear
model may be more suitable. In other words, when the residuals from a linear trend model are serially
correlated, a log-linear trend model may be more appropriate. By taking the log of the y variable, a
regression line can better fit the data. Financial data (e.g., stock indices and stock prices) and company
sales data are often best modeled with log-linear models.
Figure 2.1 shows a time series that is best modeled with a log-linear trend model rather than a linear
trend model.

Figure 2.1: Linear vs. Log-Linear Trend Models

The left panel is a plot of data that exhibits exponential growth along with a linear trend line. The panel
on the right is a plot of the natural logs of the original data and a representative log-linear trend line.
The log-linear model fits the transformed data better than the linear trend model and, therefore, yields
more accurate forecasts.
The bottom line is that when a variable grows at a constant rate, a log-linear model is most appropriate.
When the variable increases over time by a constant amount, a linear trend model is most appropriate.
Limitations of Trend Models
Recall that one of the assumptions underlying linear regression is that the residuals are uncorrelated
with each other. A violation of this assumption is referred to as autocorrelation. For AR models (where
the lagged dependent variable is an independent variable), the presence of serial correlation in the
residuals indicates that the model is misspecified. This is a significant limitation, as it means that the
model is not appropriate for the time series and that we should not use it to predict future values.
In the preceding discussion, we suggested that a log-linear trend model would be better than a linear
trend model when the variable exhibits a constant growth rate. However, it may be the case that even a
log-linear model is not appropriate in the presence of serial correlation. In this case, we will want to
turn to an autoregressive model.
Recall from the previous topic review that the Durbin-Watson statistic (DW) is used to detect
autocorrelation. For a time series model without serial correlation, DW should be approximately equal
to 2.0. A DW significantly different from 2.0 suggests that the residual terms are correlated.

MODULE QUIZ 2.1

Use the following information to answer Questions 1 through 4.

Consider the results of the regression of monthly real estate loans (RE) in billions of dollars by commercial
banks over the period January 2020 through September 2023 in the following table:

Time Series Regression Results for Real Estate Loans

1. The regression of real estate loans against time is a(n):


A. trend model.
B. AR model.
C. ARCH model.
2. The results of the estimation indicate an:
A. upward trend.
B. AR(2) model.
C. ARCH system.
3. Are the intercept and slope coefficient significantly different from zero at the 5% level of significance?
A. Both are statistically significant.
B. One is, but the other is not.
C. Neither of them is statistically significant.
4. The forecasted value of real estate loans for October 2023 is closest to:
A. $1,733.764 billion.
B. $1,745.990 billion.
C. $1,758.225 billion.
5. An analyst has determined that monthly sport utility vehicle (SUV) sales in the United States have been
increasing over the last 10 years, but the growth rate over that period has been relatively constant. Which
model is most appropriate to predict future SUV sales?
A. SUVsalest = b0 + b1(t) + et.
B. lnSUVsalest = b0 + b1(t) + et.
C. lnSUVsalest = b0 + b1(SUVsalest-1) + et.

MODULE 2.2: AUTOREGRESSIVE (AR) MODELS


LOS 2.c: Explain the requirement for a time series to be covariance stationary and
describe the significance of a series that is not stationary.

When the dependent variable is regressed against one or more lagged values of itself, the resultant
model is called an autoregressive (AR) model. For example, the sales for a firm could be regressed
against the sales for the firm in the previous month.
Consider the AR(1) model:

xt = b0 + b1xt-1 + εt

In an autoregressive time series, past values of a variable are used to predict the current (and hence
future) value of the variable.
Statistical inferences based on ordinary least squares (OLS) estimates for an AR time series model may
be invalid unless the time series being modeled is covariance stationary.
A time series is covariance stationary if it satisfies the following three conditions:
1. Constant and finite expected value. The expected value of the time series is constant over time.
(Later, we will refer to this value as the mean-reverting level.)
2. Constant and finite variance. The time series' volatility around its mean (i.e., the distribution of the
individual observations around the mean) does not change over time.
3. Constant and finite covariance between values at any given lag. The covariance of the time series
with leading or lagged values of itself is constant.

LOS 2.d: Describe the structure of an autoregressive (AR) model of order p and
calculate one- and two-period-ahead forecasts given the estimated coefficients.

The following model illustrates how variable x would be regressed on itself with a lag of one and two
periods:

xt = b0 + b1xt-1 + b2xt-2 + εt

Such a model is referred to as a second-order autoregressive model, or an AR(2) model. In general, an
AR model of order p, AR(p), is expressed as:

xt = b0 + b1xt-1 + b2xt-2 + … + bpxt-p + εt

where p indicates the number of lagged values that the AR model will include as independent
variables.

Forecasting With an Autoregressive Model


Autoregressive time series model forecasts are calculated in the same manner as those for other
regression models, but since the independent variable is a lagged value of the dependent variable, it is
necessary to calculate a one-step-ahead forecast before a two-step-ahead forecast can be calculated.
The calculation of successive forecasts in this manner is referred to as the chain rule of forecasting.
A one-period-ahead forecast for an AR(1) model is determined in the following manner:

x̂t+1 = b̂0 + b̂1xt

Likewise, a two-step-ahead forecast for an AR(1) model is calculated as:

x̂t+2 = b̂0 + b̂1x̂t+1

Note that the ^ symbol above the variables in the equations indicates that the inputs used in multi-
period forecasts are actually forecasts (estimates) themselves. This implies that multi-period forecasts
are more uncertain than single-period forecasts. For example, for a two-step-ahead forecast, there is
the usual uncertainty associated with forecasting xt+1 using xt, plus the additional uncertainty of
forecasting xt+2 using the forecasted value for xt+1.

EXAMPLE: Forecasting

Suppose that an AR(1) model has been estimated and has produced the following prediction
equation: xt = 1.2 + 0.45xt-1. Calculate a two-step-ahead forecast if the current value of x is 5.0.

Answer:

One-step-ahead forecast: x̂t+1 = 1.2 + 0.45(5.0) = 3.45.

Two-step-ahead forecast: x̂t+2 = 1.2 + 0.45(3.45) = 2.7525.
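The chain rule of forecasting from this example can be sketched as follows (each forecast feeds the next iteration):

```python
def ar1_forecast(x_current, b0=1.2, b1=0.45, steps=1):
    """Iterate an AR(1) prediction equation 'steps' periods ahead."""
    x = x_current
    for _ in range(steps):
        x = b0 + b1 * x   # each forecast becomes the input to the next
    return x

one_step = ar1_forecast(5.0, steps=1)   # 1.2 + 0.45 * 5.0  = 3.45
two_step = ar1_forecast(5.0, steps=2)   # 1.2 + 0.45 * 3.45 = 2.7525

assert abs(one_step - 3.45) < 1e-9
assert abs(two_step - 2.7525) < 1e-9
```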

LOS 2.e: Explain how autocorrelations of the residuals can be used to test whether the
autoregressive model fits the time series.

Autocorrelation & Model Fit


When an AR model is correctly specified, the residual terms will not exhibit serial correlation. Serial
correlation (or autocorrelation) means the error terms are positively or negatively correlated. When the
error terms are correlated, standard errors are unreliable and t-tests of individual coefficients can
incorrectly show statistical significance or insignificance.
If the residuals have significant autocorrelation, the AR model that produced the residuals is not the
best model for the time series being analyzed. The procedure to test whether an AR time series model
is correctly specified involves three steps:
Step 1: Estimate the AR model being evaluated using linear regression:
Start with a first-order AR model [i.e., AR(1)] using xt = b0 + b1xt-1 + εt.
Step 2: Calculate the autocorrelations of the model's residuals (i.e., the level of correlation between the
forecast errors from one period to the next).
Step 3: Test whether the autocorrelations are significantly different from zero:
If the model is correctly specified, none of the autocorrelations will be statistically significant. To
test for significance, a t-test is used to test the hypothesis that the correlations of the residuals
are zero. The t-statistic is the estimated autocorrelation divided by the standard error. The
standard error is 1/√T, where T is the number of observations, so the test statistic for each
autocorrelation is ρ̂(εt, εt-k) / (1/√T) with T − 2 degrees of freedom, where ρ̂(εt, εt-k) is the
correlation of error term t with the kth lagged error term.

PROFESSOR'S NOTE
The Durbin-Watson test that we used with trend models is not appropriate for testing for
serial correlation of the error terms in an autoregressive model. Use this t-test instead.

EXAMPLE: Testing an AR model for proper specification

The correlations of the error terms from the estimation of an AR(1) model using a sample with 102
observations are presented in the following figure. Determine whether the model is correctly
specified.

Autocorrelation Analysis

Answer:

In this example, the standard error is 1/√102, or 0.099. The t-statistic for Lag 2 is then computed as
0.0843368 / 0.099 = 0.8518.

The critical two-tail t-value at the 5% significance level and 100 degrees of freedom is 1.98. The t-
statistics indicate that none of the autocorrelations of the residuals in the previous figure is
statistically different from zero because their absolute values are less than 1.98. Thus, there is
sufficient reason to believe that the error terms from the AR(1) model are not serially correlated.

If the t-tests indicate that any of the correlations computed in Step 2 are statistically significant (i.e.,
|t| ≥ 1.98), the AR model is not specified correctly. Additional lags are included in the model and the
correlations of the residuals (error terms) are checked again. This procedure is followed until all
autocorrelations are insignificant.
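The significance test in the example can be reproduced directly (T = 102 observations and the Lag 2 autocorrelation of 0.0843368 come from the example above):

```python
import math

T = 102
rho_lag2 = 0.0843368
se = 1 / math.sqrt(T)          # standard error of each autocorrelation
t_stat = rho_lag2 / se
critical = 1.98                # two-tail 5% critical value, ~100 df

assert round(se, 3) == 0.099
assert round(t_stat, 2) == 0.85
assert abs(t_stat) < critical  # not significant: no serial correlation
```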

LOS 2.f: Explain mean reversion and calculate a mean-reverting level.

A time series exhibits mean reversion if it has a tendency to move toward its mean. In other words, the
time series has a tendency to decline when the current value is above the mean and rise when the
current value is below the mean. If a time series is at its mean-reverting level, the model predicts that
the next value of the time series will be the same as its current value (i.e., x̂t+1 = xt when a time series is
at its mean-reverting level).
For an AR(1) model, the mean-reverting level is calculated as:

x* = b0 / (1 − b1)

EXAMPLE: Mean-reverting time series

Calculate the mean-reverting level for the manufacturing capacity utilization time series using the
following regression results:

Time Series Regression Results for Manufacturing Capacity Utilization

Answer:

Substituting the estimated intercept and slope coefficient from the table into x* = b0 / (1 − b1) gives a
mean-reverting level of 67.16. This means that if the current level of manufacturing capacity utilization
is above 67.16, it is expected to fall in the next period, and if manufacturing capacity utilization is below
67.16 in the current period, it is expected to rise in the next period.

All covariance stationary time series have a finite mean-reverting level. An AR(1) time series will have a
finite mean-reverting level when the absolute value of the lag coefficient is less than 1 (i.e., |b1| < 1).
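A sketch of the mean-reverting level calculation. The coefficients below are hypothetical, chosen only so that the level matches the 67.16 in the example; the actual estimates are in the regression table.

```python
def mean_reverting_level(b0, b1):
    """x* = b0 / (1 - b1); finite only when |b1| < 1."""
    if abs(b1) >= 1:
        raise ValueError("|b1| >= 1: no finite mean-reverting level")
    return b0 / (1 - b1)

# hypothetical coefficients with b0 / (1 - b1) = 67.16
level = mean_reverting_level(26.864, 0.6)
assert round(level, 2) == 67.16
```

Passing b1 = 1 (a unit root) raises an error, mirroring the point that a random walk has no finite mean-reverting level.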

LOS 2.g: Contrast in-sample and out-of-sample forecasts and compare the forecasting
accuracy of different time-series models based on the root mean squared error
criterion.

In-sample forecasts are made within the range of data (i.e., time period) used to estimate the model,
which for a time series is known as the sample or test period. In-sample forecast errors are the
residuals yt − ŷt, where t is an observation within the sample period. In other words, we are comparing
how accurate our model is in forecasting the actual data we used to develop the model. The Predicted
vs. Actual Capacity Utilization figure in our Trend Analysis example shows an example of values
predicted by the model compared to the values used to generate the model.
Out-of-sample forecasts are made outside of the sample period. In other words, we compare how
accurate a model is in forecasting the y variable value for a time period outside the period used to
develop the model. Out-of-sample forecasts are important because they provide a test of whether the
model adequately describes the time series and whether it has relevance (i.e., predictive power) in the
real world. Nonetheless, an analyst should be aware that most published research employs in-sample
forecasts only.
The root mean squared error (RMSE) criterion is used to compare the accuracy of autoregressive
models in forecasting out-of-sample values. For example, a researcher may have two autoregressive
(AR) models: an AR(1) model and an AR(2) model. To determine which model will more accurately
forecast future values, we calculate the RMSE (the square root of the average of the squared errors) for
the out-of-sample data. Note that the model with the lowest RMSE for in-sample data may not be the
model with the lowest RMSE for out-of-sample data.
For example, imagine that we have 60 months of historical unemployment data. We estimate both
models over the first 36 of the 60 months. To determine which model will produce better (i.e., more
accurate) forecasts, we then forecast the values for the last 24 of the 60 months of historical data. Using
the actual values for the last 24 months as well as the values predicted by the models, we can calculate
the RMSE for each model.
The model with the lower RMSE for the out-of-sample data will have lower forecast error and will be
expected to have better predictive power in the future.
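The RMSE comparison can be sketched with made-up numbers; both the actual values and the two models' forecasts below are hypothetical:

```python
import math

def rmse(actual, forecast):
    """Root mean squared error: sqrt of the average squared forecast error."""
    sq_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

actual  = [4.0, 5.0, 6.0, 5.5]   # hypothetical out-of-sample values
model_a = [4.2, 4.8, 6.1, 5.4]   # hypothetical AR(1) forecasts
model_b = [3.5, 5.6, 6.4, 5.0]   # hypothetical AR(2) forecasts

# the model with the lower out-of-sample RMSE is preferred
assert rmse(actual, model_a) < rmse(actual, model_b)
```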
In addition to examining the RMSE criterion for a model, we will also want to examine the stability of
regression coefficients, which we discuss next.

LOS 2.h: Explain the instability of coefficients of time-series models.

Financial and economic time series inherently exhibit some form of instability or nonstationarity. This is
because financial and economic conditions are dynamic, and the estimated regression coefficients in
one period may be quite different from those estimated during another period.
Models estimated with shorter time series are usually more stable than those with longer time series
because a longer sample period increases the chance that the underlying economic process has
changed. Thus, there is a tradeoff between the increased statistical reliability of longer time periods
and the increased stability of the estimates from shorter periods.
The primary concern when selecting a time series sample period is the underlying economic process.
Have there been regulatory changes? Has there been a dramatic change in the underlying economic
environment?
If the answer is yes, then the historical data may not provide a reliable model. Merely examining the
significance of the autocorrelation of the residuals will not indicate whether the model is valid. We must
also examine whether the data is covariance stationary.

MODULE QUIZ 2.2


1. Is the time series shown in the following figure likely to be covariance stationary?

A. X is not covariance stationary due to homoskedasticity.


B. X is not covariance stationary due to non-constant mean.
C. X is covariance stationary.
2. Given the prediction equation: what is the forecast value of xt+2 if xt-1 is 16.5?
A. 64.28.
B. 117.49.
C. 210.61.
3. When evaluating a time series model's real-world ability to forecast, we would have the most confidence in a
model with small:
A. in-sample forecast error.
B. out-of-sample forecast error.
C. residuals.

MODULE 2.3: RANDOM WALKS AND UNIT ROOTS


LOS 2.i: Describe characteristics of random walk processes and contrast them to
covariance stationary processes.

Random walk. If a time series follows a random walk process, the predicted value of the series (i.e., the
value of the dependent variable) in one period is equal to the value of the series in the previous period
plus a random error term.
A time series that follows a simple random walk process is described in equation form as xt = xt-1 + εt,
where the best forecast of xt is xt-1 and:
1. E(εt) = 0: The expected value of each error term is zero.
2. E(εt2) = σ2: The variance of the error terms is constant.
3. E(εiεj) = 0; if i ≠ j: There is no serial correlation in the error terms.
Random walk with a drift. If a time series follows a random walk with a drift, the intercept term is not
equal to zero. That is, in addition to a random error term, the time series is expected to increase or
decrease by a constant amount each period. A random walk with a drift can be described as:

xt = b0 + xt-1 + εt, where b0 ≠ 0

Covariance stationarity. Neither a random walk nor a random walk with a drift exhibits covariance
stationarity. To show this, let's start by expressing a random walk in the form of an AR(1) model:

xt = b0 + b1xt-1 + εt, where b0 = 0 and b1 = 1

In either case (with or without a drift), the mean-reverting level is b0 / (1 − b1) = b0 / 0, which is
undefined (the division of any number by zero is undefined), and as we stated earlier, a time series must
have a finite mean-reverting level to be covariance stationary. Thus, a random walk, with or without a
drift, is not covariance stationary, and exhibits what is known as a unit root (b1 = 1). For a time series
that is not covariance stationary, the least squares regression procedure that we have been using to
estimate an AR(1) model will not work without transforming the data. We discuss unit roots and how
they are handled in the next section.

LOS 2.j: Describe implications of unit roots for time-series analysis, explain when unit roots are likely to
occur and how to test for them, and demonstrate how a time series with a unit root can be transformed
so it can be analyzed with an AR model.

LOS 2.k: Describe the steps of the unit root test for nonstationarity and explain the relation of the test
to autoregressive time-series models.

As we discussed in the previous LOS, if the coefficient on the lag variable is 1, the series is not
covariance stationary. If the value of the lag coefficient is equal to one, the time series is said to have a
unit root and will follow a random walk process. Since a time series that follows a random walk is not
covariance stationary, modeling such a time series in an AR model can lead to incorrect inferences.
Unit Root Testing for Nonstationarity
To determine whether a time series is covariance stationary, we can (1) run an AR model and examine
autocorrelations, or (2) perform the Dickey-Fuller test.
In the first method, an AR model is estimated and the statistical significance of the autocorrelations at
various lags is examined. A stationary process will usually have residual autocorrelations insignificantly
different from zero at all lags or residual autocorrelations that decay to zero as the number of lags
increases.
A more definitive test for a unit root is the Dickey-Fuller test. For statistical reasons, you cannot directly
test whether the coefficient on the independent variable in an AR time series is equal to 1. To
compensate, Dickey and Fuller created a rather ingenious test for a unit root. Remember, if an AR(1)
model has a coefficient of 1, it has a unit root and no finite mean-reverting level (i.e., it is not covariance
stationary). Dickey and Fuller (DF) transform the AR(1) model to run a simple regression. To transform
the model, they (1) start with the basic form of the AR(1) model and (2) subtract xt-1 from both sides:

xt = b0 + b1xt-1 + εt
xt − xt-1 = b0 + (b1 − 1)xt-1 + εt

Then, rather than directly testing whether the original coefficient is different from 1, they test whether
the new, transformed coefficient (b1 − 1) is different from zero using a modified t-test. If (b1 − 1) is not
significantly different from zero, they say that b1 must be equal to 1.0 and, therefore, the series must
have a unit root.
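A toy illustration of the transformed regression. This is only a sketch of the algebra, not the actual Dickey-Fuller test (which uses the modified critical values): the series below is generated without noise from a hypothetical AR(1) with b1 = 0.5, and plain OLS on the transformed equation recovers the coefficient g = b1 − 1.

```python
def ols(xs, ys):
    """Slope and intercept from a simple least-squares regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    return slope, my - slope * mx

# Noise-free AR(1) series x_t = 2.0 + 0.5 * x_{t-1} (hypothetical numbers).
x = [1.0]
for _ in range(20):
    x.append(2.0 + 0.5 * x[-1])

# Dickey-Fuller transform: regress (x_t - x_{t-1}) on x_{t-1}.
dx = [x[t] - x[t - 1] for t in range(1, len(x))]
g, intercept = ols(x[:-1], dx)

# The slope estimates g = b1 - 1 = -0.5; a g significantly below zero
# is evidence against a unit root.
assert abs(g - (-0.5)) < 1e-6
```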

PROFESSOR'S NOTE
In their actual test, Dickey and Fuller use the variable g, which equals (b1 − 1). The null
hypothesis is g = 0 (i.e., the time series has a unit root). For the exam, understand how the
test is conducted and be able to interpret its results. For example, if on the exam you are told
the null (g = 0) cannot be rejected, your answer is that the time series has a unit root. If the
null is rejected, the time series does not have a unit root.

First Differencing
If we believe a time series is a random walk (i.e., has a unit root), we can transform the data to a
covariance stationary time series using a procedure called first differencing. The first differencing
process involves subtracting the value of the time series (i.e., the dependent variable) in the
immediately preceding period from the current value of the time series to define a new dependent
variable, y. Note that by taking first differences, you model the change in the value of the dependent
variable.
So, if the original time series of x has a unit root, the change in x, xt − xt-1 = εt, is just the error term. This
means we can define yt as:

yt = xt − xt-1 = εt

Then, stating y in the form of an AR(1) model:

yt = b0 + b1yt-1 + εt, where b0 = b1 = 0

This transformed time series has a finite mean-reverting level of b0 / (1 − b1) = 0 / (1 − 0) = 0 and is,
therefore, covariance stationary.
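A quick simulation illustrates why first differencing works: if x is built as a random walk, differencing it recovers the (stationary) error terms. This is a sketch with simulated data:

```python
import random

random.seed(42)
shocks = [random.gauss(0, 1) for _ in range(200)]

# Build a random walk: x_t = x_{t-1} + e_t
x = [0.0]
for e in shocks:
    x.append(x[-1] + e)

# First differences: y_t = x_t - x_{t-1}
y = [x[t] - x[t - 1] for t in range(1, len(x))]

# Differencing recovers the shocks (up to floating-point rounding),
# and the shocks are covariance stationary by construction.
assert all(abs(yi - ei) < 1e-9 for yi, ei in zip(y, shocks))
```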
EXAMPLE: Unit root

Suppose we decide to model the capacity utilization data. Using an AR(1) model, the results indicate
that the capacity utilization time series probably contains a unit root and is, therefore, not covariance
stationary. Discuss how this time series can be transformed to be covariance stationary.

Answer:

Covariance stationarity can often be achieved by transforming the data using first differencing and
modeling the first-differenced time series as an autoregressive time series.

EXAMPLE: First differencing

The next figure contains the first differences of our manufacturing capacity utilization time series for
the period 2020.1 through 2023.3. The first two columns contain the original time series. The first
differences of the original series are contained in the third column of the table, and the one-period
lagged values of the first differences are presented in the fourth column of the table. Note that the
first differences in this example represent the change in manufacturing capacity from the preceding
period and are designated as yt and yt-1.

First-Differenced Manufacturing Capacity Utilization Data

After this transformation, it is appropriate to regress the AR(1) model, yt = b0 + b1yt-1 + εt. The
regression results for the first-differenced time series are presented in the next figure, where it can be
seen that the estimated coefficient on the lag variable is statistically significant at the 5% level of
significance.

Regression Output for First-Differenced Manufacturing Capacity

MODULE QUIZ 2.3

Use the following information to answer Questions 1 and 2.

The results of the estimation of monthly revolving credit outstanding (RCO) on the one-period lagged values for
RCO from January 2020 through December 2022 are presented in the following table.

Regression Results for Outstanding Revolving Credit Study

1. What type of time-series model was used to produce the regression results in the table? A(n):
A. AR model.
B. heteroskedasticity (H) model.
C. trend model with a drift.
2. An approach that may work in the case of modeling a time series that has a unit root is to:
A. use an ARCH model.
B. use a trend model.
C. model the first differences of the time series.
3. Which of the following will always have a finite mean-reverting level?
A. A covariance-stationary time series.
B. A random-walk-with-drift time series.
C. A time series with a unit root.
4. Which of the following statements is most accurate? A random walk process:
A. is nonstationary.
B. has a finite mean-reverting level.
C. can be appropriately fit as an AR(1) model.
5. Which of the following is not correct about the Dickey-Fuller unit root test for nonstationarity?
A. The null hypothesis is that the time series has a unit root.
B. A hypothesis test is conducted using critical values computed by Dickey and Fuller in place of
conventional t-test values.
C. If the test statistic is significant, we conclude that the time series is nonstationary.

MODULE 2.4: SEASONALITY


LOS 2.l: Explain how to test and correct for seasonality in a time-series model and
calculate and interpret a forecasted value using an AR model with a seasonal lag.

Seasonality in a time series is a pattern that tends to repeat from year to year. One example is monthly
sales data for a retailer. Given that sales data normally vary according to the time of year, we might
expect this month's sales (xt) to be related to sales for the same month last year (xt-12).
When seasonality is present, a model of the associated time series data would be misspecified unless
the AR model incorporates the effects of the seasonality.

EXAMPLE: Detecting seasonality

You are interested in predicting occupancy levels for a resort hotel chain and have obtained the
chain's quarterly occupancy levels for the most recent 40 quarters (10 years). You decide to model
the quarterly occupancy time series using the AR(1) model:

ln xt = b0 + b1(ln xt-1) + εt

Determine whether seasonality exists using the results presented in the following table.

Autoregression Output for Log-Quarterly Hotel Occupancy

Answer:

The bottom part of the table contains the residual autocorrelations for the first four lags of the time
series. What stands out is the relatively large autocorrelation and t-statistic for the fourth lag. With
39 observations and two parameters (b0 and b1), there are 37 degrees of freedom. At a significance
level of 5%, the critical t-value is 2.026.

The t-statistics indicate that none of the first three lagged autocorrelations is significantly different
from zero. However, the t-statistic at Lag 4 is 5.4460, which means that we must reject the null
hypothesis that the Lag 4 autocorrelation is zero and conclude that seasonality is present in the time
series. Thus, we conclude that this model is misspecified and will be unreliable for forecasting
purposes. We need to include a seasonality term to specify the model correctly.
PROFESSOR'S NOTE
The reason 40 quarters of data produces only 39 observations is that the AR(1) model uses the
prior quarter's value as the independent variable; 40 data points yield only 39 usable (current,
lagged) pairs.

Correcting for seasonality. The interpretation of seasonality in the previous example is that occupancy
in any quarter is related to occupancy in the previous quarter and the same quarter in the previous
year. For example, fourth quarter 2022 occupancy is related to third quarter 2022 occupancy as well as
fourth quarter 2021 occupancy.
To adjust for seasonality in an AR model, an additional lag of the dependent variable (corresponding to
the same period in the previous year) is added to the original model as another independent variable.
For example, if quarterly data are used, the seasonal lag is 4; if monthly data are used, the seasonal lag
is 12; and so on.

EXAMPLE: Correcting for seasonality in a time-series model

We continue with our resort occupancy level example, where the significant residual correlation at
Lag 4 indicates seasonality in the quarterly time series. By testing the correlations of the error terms,
it appears that occupancy levels in each quarter are related not only to the previous quarter, but also
to the corresponding quarter in the previous year. To adjust for this problem, we add a lagged value
of the dependent variable to the original model that corresponds to the seasonal pattern.

To model the autocorrelation of the same quarters from year to year, we use an AR(1) model with a
seasonal lag: ln xt = b0 + b1(ln xt-1) + b2(ln xt-4) + εt. Note that this specification, the inclusion of a
seasonal lag, does not result in an AR(2) model. It results in an AR(1) model incorporating a seasonal
lag term.

The results obtained when this model is fit to the natural logarithm of the time series are presented
in the following. Determine whether the model is specified correctly.

Answer:

Notice in the bottom of the table that the fourth-lag residual autocorrelation has dropped
substantially and is, in fact, no longer statistically significant. Also notable in these results is the
improvement in the R-squared for the adjusted model (94.9%) compared to the R-squared from the
original model (79.3%). The results shown in the figure indicate that, by incorporating a seasonal lag
term, the model is now specified correctly.

Forecasting Using an AR Model with a Seasonal Lag


EXAMPLE: Forecasting with an autoregressive model

Based on the regression results from the previous example and the occupancy levels over the past
year (presented next), forecast the level of hotel occupancy for the first quarter of 2023.

Quarterly Hotel Occupancy Levels

Answer:

We express the seasonally adjusted forecasting equation as:

ln x̂t = b0 + b1(ln xt-1) + b2(ln xt-4)

where xt is the occupancy level for the tth quarter.

To forecast the occupancy level for the hotel chain for the first quarter of 2023 (i.e., 2023.1), the
following computation is made:

ln x̂2023.1 = b0 + b1(ln x2022.4) + b2(ln x2022.1) = 13.3103

The forecasted level of hotel occupancy for the first quarter of 2023 is e^13.3103 ≈ 603,379, a significant
increase over the same quarter the previous year.

PROFESSOR'S NOTE
Once again, the first answer you get in this calculation is the natural log of the occupancy
forecast. To turn the natural log into an occupancy figure, you use the 2nd function
of the LN key (ex) on your BA II Plus: enter 13.3103 and press [2nd] ex = 603,378.52.
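The same conversion can be checked in code; math.exp reproduces the calculator's e^x keystroke:

```python
import math

# The regression produces the natural log of the forecast; exponentiate
# to recover the occupancy level itself.
log_forecast = 13.3103
occupancy_forecast = math.exp(log_forecast)  # about 603,378.5
```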

MODULE QUIZ 2.4

Use the following information to answer Questions 1 through 3.

Regression Results for Monthly Cash Flow Study


1. The number of observations in the time series used to estimate the model represented in the table is closest
to:
A. 16.
B. 50.
C. 250.
2. Based on the information given, what type of model was used?
A. AR(1).
B. AR(2).
C. AR(12).
3. At a 5% level of significance, does the information indicate the presence of seasonality?
A. No, because the lag-12 autocorrelation of the residual is not significant.
B. Yes, because the lag-12 autocorrelation of the residual is significantly different than one.
C. There is not enough information provided; the autocorrelation for the first lag is also needed to detect
seasonality.
4. A time-series model that uses quarterly data exhibits seasonality if the fourth autocorrelation of the error
term:
A. differs significantly from 0.
B. does not differ significantly from 0.
C. does not differ significantly from the first autocorrelation of the error term.
5. In an autoregressive time-series model, seasonality may be corrected by:
A. excluding one or more of the lagged variables until the seasonality disappears.
B. transforming the time series using first-differencing.
C. adding an additional variable that reflects an appropriate lag of the time series.
6. Which of the following AR models is most appropriate for a time series with annual seasonality using
quarterly observations?
A. b1xt-1 + b2xt-12 + εt.
B. b0 + b1xt-1 + b2xt-4 + εt.
C. b0 + b1xt-4 + b2xt-12 + εt.

MODULE 2.5: ARCH AND MULTIPLE TIME SERIES


LOS 2.m: Explain autoregressive conditional heteroskedasticity (ARCH) and describe
how ARCH models can be applied to predict the variance of a time series.

When examining a single time series, such as an AR model, autoregressive conditional
heteroskedasticity (ARCH) exists if the variance of the residuals in one period is dependent on the
variance of the residuals in a previous period. When this condition exists, the standard errors of the
regression coefficients in AR models and the hypothesis tests of these coefficients are invalid.

Using ARCH Models


An ARCH model is used to test for autoregressive conditional heteroskedasticity. Within the ARCH
framework, an ARCH(1) time series is one for which the variance of the residuals in one period is
dependent on (i.e., a function of) the variance of the residuals in the preceding period. To test whether
a time series is ARCH(1), the squared residuals (ε̂t²) from an estimated time-series model are regressed
on the first lag of the squared residuals (ε̂t-1²).
The ARCH(1) regression model is expressed as:

ε̂t² = a0 + a1(ε̂t-1²) + μt

where a0 is the constant and μt is an error term.


If the coefficient, a1, is statistically different from zero, the time series is ARCH(1).
If a time-series model has been determined to contain ARCH errors, regression procedures that correct
for heteroskedasticity, such as generalized least squares, must be used in order to develop a predictive
model. Otherwise, the standard errors of the model's coefficients will be incorrect, leading to invalid
conclusions.

Predicting the Variance of a Time Series


If a time series has ARCH errors, an ARCH model can be used to predict the variance of the
residuals in future periods. For example, if the data exhibit an ARCH(1) pattern, the ARCH(1) model can
be used in period t to predict the variance of the residuals in period t + 1:

σ̂t+1² = â0 + â1(ε̂t²)

EXAMPLE: ARCH(1) time series

The next figure contains the results from the regression of an ARCH(1) model. The squared errors for
periods t through T are regressed on the squared errors for periods t - 1 through T - 1. (μt is the error
term for the model.) Determine whether the results indicate autoregressive conditional
heteroskedasticity (ARCH), and if so, calculate the predicted variance of the error terms in the next
period if the current period squared error is 0.5625.

ARCH (1) Regression Results

Answer:

Since the p-value for the coefficient on the lagged variable indicates statistical significance, we can
conclude that the time series is ARCH(1). As such, the variance of the error term in the next period
can be computed as:

σ̂t+1² = â0 + â1(0.5625)

PROFESSOR'S NOTE
If the coefficient a1 is zero, the variance is constant from period to period. If a1 is greater
than (less than) zero, the variance increases (decreases) over time (i.e., the error terms
exhibit heteroskedasticity).

LOS 2.n: Explain how time-series variables should be analyzed for nonstationarity and/or
cointegration before use in a linear regression.

Occasionally an analyst will run a regression using two time series (i.e., time series utilizing two different
variables). For example, using the market model to estimate the equity beta for a stock, an analyst
regresses a time series of the stock's returns (yt) on a time series of returns for the market (xt):

yt = b0 + b1xt + εt

Notice that now we are faced with two different time series (yt and xt), either or both of which could be
subject to nonstationarity.
To test whether the two time series have unit roots, the analyst first runs separate DF tests with five
possible results:
1. Both time series are covariance stationary.
2. Only the dependent variable time series is covariance stationary.
3. Only the independent variable time series is covariance stationary.
4. Neither time series is covariance stationary and the two series are not cointegrated.
5. Neither time series is covariance stationary and the two series are cointegrated.
In Scenario 1, the analyst can use linear regression, and the coefficients should be statistically reliable,
but regressions in Scenarios 2 and 3 will not be reliable. Whether linear regression can be used in
Scenarios 4 and 5 depends upon whether the two time series are cointegrated.

Cointegration
Cointegration means that two time series are economically linked (related to the same macro variables)
or follow the same trend, and that relationship is not expected to change. If two time series are
cointegrated, the error term from regressing one on the other is covariance stationary and the t-tests
are reliable. This means that Scenario 5 will produce reliable regression estimates, whereas Scenario 4
will not.
To test whether two time series are cointegrated, we regress one variable on the other using the
following model:

yt = b0 + b1xt + εt

The residuals are tested for a unit root using the Dickey-Fuller test with critical t-values calculated by
Engle and Granger (i.e., the DF–EG test). If the test rejects the null hypothesis of a unit root, we say the
error terms generated by the two time series are covariance stationary and the two series are
cointegrated. If the two series are cointegrated, we can use the regression to model their relationship.

PROFESSOR'S NOTE
For the exam, remember that the Dickey-Fuller test does not use the standard critical t-values
we typically use in testing the statistical significance of individual regression coefficients. The
DF–EG test further adjusts them to test for cointegration. As with the DF test, you do not have
to know critical t-values for the DF–EG test. Just remember that, like the regular DF test, if the
null is rejected, we say the series (of error terms in this case) is covariance stationary and the
two time series are cointegrated.

Figure 2.2: Can Linear Regression Be Used to Model the Relationship Between Two Time Series?
LOS 2.o: Determine an appropriate time-series model to analyze a given investment problem and
justify that choice.

To determine what type of model is best suited to meet your needs, follow these guidelines:
1. Determine your goal.
Are you attempting to model the relationship of a variable to other variables (e.g., cointegrated
time series, cross-sectional multiple regression)?
Are you trying to model the variable over time (e.g., trend model)?
2. If you have decided on using a time-series analysis for an individual variable, plot the values of the
variable over time and look for characteristics that would indicate nonstationarity, such as non-
constant variance (heteroskedasticity), non-constant mean, seasonality, or structural change.
A structural change is indicated by a significant shift in the plotted data at a point in time that seems
to divide the data into two or more distinct patterns. (Figure 2.3 shows a data plot that indicates a
structural shift in the time series at Point a.) In this example, you have to run two different models,
one incorporating the data before and one after that date, and test whether the time series has
actually shifted. If the time series has shifted significantly, a single time-series model encompassing the
entire period (i.e., both patterns) will likely produce unreliable results.

Figure 2.3: A Structural Shift in a Time Series

3. If there is no seasonality or structural shift, use a trend model.
If the data plot on a straight line with an upward or downward slope, use a linear trend model.
If the data plot in a curve, use a log-linear trend model.
4. Run the trend analysis, compute the residuals, and test for serial correlation using the Durbin-
Watson test.
If you detect no serial correlation, you can use the model.
If you detect serial correlation, you must use another model (e.g., AR).
5. If the data has serial correlation, reexamine the data for stationarity before running an AR model. If it
is not stationary, treat the data for use in an AR model as follows:
If the data has a linear trend, first-difference the data.
If the data has an exponential trend, first-difference the natural log of the data.
If there is a structural shift in the data, run two separate models as discussed previously.
If the data has a seasonal component, incorporate the seasonality in the AR model as discussed in
the following.
6. After first-differencing in Step 5, if the series is covariance stationary, run an AR(1) model and
test for serial correlation and seasonality.
If there is no remaining serial correlation, you can use the model.
If you still detect serial correlation, incorporate lagged values of the variable into the AR model,
possibly including one for seasonality (e.g., for monthly data, add the 12th lag of the time series),
until you have removed (i.e., modeled) any serial correlation.
7. Test for ARCH. Regress the square of the residuals on squares of lagged values of the residuals and
test whether the resulting coefficient is significantly different from zero.
If the coefficient is not significantly different from zero, you can use the model.
If the coefficient is significantly different from zero, ARCH is present. Correct using generalized
least squares.
8. If you have developed two statistically reliable models and want to determine which is better at
forecasting, calculate their out-of-sample RMSE.
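Step 8 above can be sketched directly; the out-of-sample values and forecasts below are made up for illustration:

```python
import math

def rmse(actual, forecast):
    """Root mean squared error over an out-of-sample window."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Made-up out-of-sample values and forecasts from two candidate models:
actual     = [10.0, 11.0, 12.0, 13.0]
model_a_fc = [9.5, 11.5, 12.0, 13.5]   # errors: 0.5, -0.5, 0.0, -0.5
model_b_fc = [9.0, 12.0, 13.0, 12.0]   # errors: 1.0, -1.0, -1.0, 1.0
# The model with the lower out-of-sample RMSE (model A here) is preferred.
```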

MODULE QUIZ 2.5


1. Which of the following is true of modeling a time series that contains two or more distinct periods where the
data is fundamentally different?
A. The optimal data sample period for estimating the time-series model can be calculated mathematically.
B. To most accurately estimate the time-series model, the entire available time series data set should be
used as the sample period.
C. We have to fit two different models for each of the two distinct periods.
2. Which of the following indicates the presence of Autoregressive Conditional Heteroskedasticity (ARCH) in a
time-series model?
A. The autocorrelations of the error terms are zero at all lags.
B. The variance of the current error depends on the variance of lagged errors.
C. The error term shows significant serial correlation at lag 1.
3. Linear regression is least appropriate for modeling the relationship between two time series when:
A. neither series has a unit root.
B. one of the time series has a unit root, the other does not.
C. both series have a unit root, and the time series are cointegrated.

KEY CONCEPTS

LOS 2.a
A time series is a set of observations for a variable over successive periods of time. A time-series model
captures the time-series pattern and allows us to make predictions about the variable in the future.

LOS 2.b
A simple linear trend model is: yt = b0 + b1t + εt, estimated for t = 1, 2, …, T.
A log-linear trend model, ln(yt) = b0 + b1t + εt, is appropriate for exponential data.
A plot of the data should be used to determine whether a linear or log-linear trend model should be
used.
The primary limitation of trend models is that they are not useful if the residuals exhibit serial
correlation.

LOS 2.c
A time series is covariance stationary if its mean, variance, and covariances with lagged and leading
values do not change over time. Covariance stationarity is a requirement for using AR models.

LOS 2.d
Autoregressive time-series multiperiod forecasts are calculated in the same manner as those for other
regression models, but since the independent variable consists of a lagged variable, it is necessary to
calculate a one-step-ahead forecast before a two-step-ahead forecast may be calculated. The
calculation of successive forecasts in this manner is referred to as the chain rule of forecasting.
A one-period-ahead forecast for an AR(1) would be determined in the following manner:

x̂t+1 = b̂0 + b̂1xt

A two-period-ahead forecast for an AR(1) would be determined in the following manner:

x̂t+2 = b̂0 + b̂1x̂t+1
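The chain rule of forecasting is easy to express in code; the coefficients below are hypothetical:

```python
def chain_forecast(b0, b1, x_t, steps):
    """Multiperiod AR(1) forecasts via the chain rule: each forecast is
    fed back in as the lagged value for the next period's forecast."""
    forecasts = []
    x = x_t
    for _ in range(steps):
        x = b0 + b1 * x
        forecasts.append(x)
    return forecasts

# Hypothetical AR(1): x_t = 1.0 + 0.5 * x_{t-1}, with x_t = 3.0 today.
# One period ahead:  1.0 + 0.5(3.0) = 2.5
# Two periods ahead: 1.0 + 0.5(2.5) = 2.25
```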

LOS 2.e
When an AR model is correctly specified, the residual terms will not exhibit serial correlation. If the
residuals possess some degree of serial correlation, the AR model that produced the residuals is not the
best model for the data being studied, and the regression results will be problematic. The procedure to
test whether an AR time-series model is correctly specified involves three steps:
1. Estimate the AR model being evaluated using linear regression.
2. Calculate the autocorrelations of the model's residuals.
3. Test whether the autocorrelations are significant.

LOS 2.f
A time series is mean reverting if it tends toward its mean over time. The mean-reverting level for an AR(1) model is b0 / (1 − b1).
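As a sketch (with made-up coefficients):

```python
def mean_reverting_level(b0, b1):
    """Mean-reverting level of an AR(1) model: b0 / (1 - b1).
    Not defined when b1 = 1 (a unit root)."""
    if b1 == 1:
        raise ValueError("b1 = 1: unit root, no finite mean-reverting level")
    return b0 / (1 - b1)

# Hypothetical AR(1): x_t = 2.0 + 0.5 * x_{t-1} reverts toward 2.0 / (1 - 0.5) = 4.0
```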

LOS 2.g
In-sample forecasts are made within the range of data used in the estimation. Out-of-sample forecasts
are made outside of the time period for the data used in the estimation.
The root mean squared error (RMSE) criterion is used to compare the accuracy of autoregressive
models in forecasting out-of-sample values. A researcher may have two autoregressive (AR) models,
both of which seem to fit the data: an AR(1) model and an AR(2) model. To determine which model will
more accurately forecast future values, we calculate the square root of the mean squared error (RMSE).
The model with the lower RMSE for the out-of-sample data will have lower forecast error and will be
expected to have better predictive power in the future.

LOS 2.h
Most economic and financial time-series data are not stationary. The degree of the nonstationarity
depends on the length of the series and changes in the underlying economic environment.

LOS 2.i
A random walk time series is one for which the value in one period is equal to the value in the previous
period, plus a random error. A random walk process does not have a finite mean-reverting level and is
not stationary.

LOS 2.j
A time series has a unit root if the coefficient on the lagged dependent variable is equal to one. A series
with a unit root is not covariance stationary. Economic and finance time series frequently have unit
roots. Data with a unit root must be first differenced before being used in a time-series model.
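First-differencing is a one-line transformation; the sketch below builds a simulated random walk and differences it (illustrative only):

```python
import random

def first_difference(series):
    """First-difference a series: y_t = x_t - x_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# A simulated random walk x_t = x_{t-1} + e_t has a unit root; its first
# difference recovers the stationary error series (up to rounding).
random.seed(7)
errors = [random.gauss(0, 1) for _ in range(100)]
walk = [0.0]
for e in errors:
    walk.append(walk[-1] + e)
diffs = first_difference(walk)
```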

LOS 2.k
To determine whether a time series is covariance stationary, we can (1) run an AR model and/or (2)
perform the Dickey-Fuller test.

LOS 2.l
Seasonality in a time series is tested by calculating the autocorrelations of the error terms. A statistically
significant error-term autocorrelation at the lag corresponding to the periodicity of the data indicates
seasonality.
Seasonality can be corrected by incorporating the appropriate seasonal lag term in an AR model.
If a seasonal lag coefficient is appropriate and corrects the seasonality, the AR model with the seasonal
terms will have no statistically significant autocorrelations of error terms.

LOS 2.m
ARCH is present if the variance of the residuals from an AR model is correlated across time. ARCH is
detected by estimating ε̂t² = a0 + a1(ε̂t-1²) + μt. If a1 is significant, ARCH exists, and the variance of errors
can be predicted using: σ̂t+1² = â0 + â1(ε̂t²).

LOS 2.n
When working with two time series in a regression: (1) if neither time series has a unit root, then the
regression can be used; (2) if only one series has a unit root, the regression results will be invalid; (3) if
both time series have a unit root and are cointegrated, then the regression can be used; (4) if both time
series have a unit root but are not cointegrated, the regression results will be invalid.
The Dickey-Fuller test with critical t-values calculated by Engle and Granger is used to determine
whether two time series are cointegrated.

LOS 2.o
The RMSE criterion is used to determine which forecasting model will produce the most accurate
forecasts. The RMSE equals the square root of the average squared error.

ANSWER KEY FOR MODULE QUIZZES

Module Quiz 2.1


1. A With a trend model, the independent variable is time, t. (LOS 2.b)
2. A The slope coefficient (b1) is positive and significantly different from zero, indicating an upward
trend. (LOS 2.a)
3. A The t-statistic to test the statistical significance of the intercept and slope coefficient is the
parameter estimate divided by its standard error. We reject the null hypothesis and conclude the
coefficients are statistically significant if the absolute value of the t-statistic is greater than the
two-tail 5% critical t-value with 43 degrees of freedom, which is 2.02.

Both the intercept term and the slope coefficient are significantly different from zero at the 5%
level because both t-statistics are greater than the critical t-value of 2.02. (LOS 2.a)
4. C
(LOS 2.a)
5. B A log-linear model (choice B) is most appropriate for a time series that grows at a relatively
constant growth rate. Neither a linear trend model (choice A) nor an AR(1) model (choice C) is
appropriate in this case. (LOS 2.b)

Module Quiz 2.2


1. B Time series X has a definite upward trend, which once again suggests the expected value of the
time series X is not constant, and therefore it is not covariance stationary. (LOS 2.c)
2. B Given
(LOS 2.d)
3. B Out-of-sample performance is the most important indicator of a model's real-world forecasting
ability. In-sample forecast performance is less persuasive, because forecasting the past is not
difficult. The residuals from the fitted time-series model are another name for the model's in-
sample forecast errors. (LOS 2.g)

Module Quiz 2.3


1. A The independent variable is the dependent variable lagged one period, so the model is an AR(1)
model. (Module 2.2, LOS 2.d)
2. C The first-differenced series usually does not have a unit root and is, therefore, covariance
stationary. (Module 2.3, LOS 2.j)
3. A All random-walk time series have a unit root. Time series with a unit root do not have a finite mean-
reverting level. (Module 2.3, LOS 2.i)
4. A A random walk process does not have a finite mean-reverting level and hence is covariance
nonstationary. An AR(1) model cannot be used to fit a covariance nonstationary time series.
(Module 2.3, LOS 2.j)
5. C For a unit root test, the null hypothesis is that the time series has a unit root. For testing for unit
roots, the Dickey-Fuller (DF) test computes the conventional t-statistic, which is then compared
against the revised set of critical values computed by DF. If the test statistic is significant, we reject
the null hypothesis (that the time series has a unit root), implying that a unit root is not present.
(Module 2.3, LOS 2.k)

Module Quiz 2.4


1. C The standard error of the estimated autocorrelations is 1/√T, where T is the number of
observations (periods). So, if the standard error is given as 0.0632, the number of observations, T,
in the time series must be (1 / 0.0632)² ≈ 250. (Module 2.2, LOS 2.e)
2. A The results in the table indicate that the prediction equation is xt = 26.8625 + 0.7196xt-1, which is
estimated from an AR(1) model. (Module 2.1, LOS 2.a)
3. A The autocorrelation in the twelfth month is not statistically different from zero (p-value: 0.5612 >
0.05). Thus, there appears to be no seasonality. (Module 2.4, LOS 2.l)
4. A If the fourth autocorrelation of the error term differs significantly from 0, this is an indication of
seasonality. (Module 2.4, LOS 2.l)
5. C Adding an appropriate lag is an appropriate solution to seasonality. Excluding variables can
sometimes be used to solve multicollinearity. Transforming using first-differencing can be a cure
for nonstationarity. (Module 2.4, LOS 2.l)
6. B The seasonal (annual) lag occurs on a quarterly basis, so the appropriate model is
b0 + b1xt-1 + b2xt-4 + εt. The intercept b0 should be included in the model. (Module 2.4, LOS 2.l)

Module Quiz 2.5


1. C To accurately model a time series that contains shifts, it may be necessary to strategically choose a
longer or shorter sample period, or to use a first- or second-order autoregressive model. There is
no accepted formula for estimating the optimal sample period (though a graphical inspection of
the data may be helpful). (LOS 2.o)
2. B ARCH is present when the variance of the error depends on the variance of previous errors. A zero
autocorrelation of the error term at all lags suggests that an autoregressive model is a good fit to
the data. (LOS 2.m)
3. B If only one time series has a unit root, we should not use linear regression. If neither time series
has a unit root, or if both time series have a unit root and the time series are cointegrated, linear
regression is appropriate to use. (LOS 2.n)
