Introduction to Forecasting

Ruitian Lang
August 21, 2023

1 Overview
The typical steps of developing a forecasting model are as follows:
1. Visualize the data to get an intuitive idea of what they look like and whether there are trends and
seasonality. With experience, one may also get a good sense of whether the data possess a
unit root.
2. Come up with several candidate models.
3. Evaluate the candidate models. There are two methods of doing this: (a) use some rule-of-thumb
criteria; (b) test the performance of the models in a validation set.
4. Evaluate the performance of the chosen model using the test set.
5. Deploy the model. Based on its performance with real data, the model may be adjusted after
deployment.
In evaluating a forecasting model, we often perform a rolling forecast:
1. Let T be the length of the training period.
2. Fit the model with the first T periods of data.
3. Use the fitted model to forecast periods T + 1 to T + h. The positive number h is called the length of the forecasting window. Write down the forecast result.
4. Increase T to T + h and go back to Step 2.
Two features of the rolling forecast procedure are worth noting (see the sketch below).
* It involves repeating Steps 2 through 4 multiple times. This is done with a loop in Python.
* The rolling forecast procedure is the same no matter which model is used. Therefore, it is a good
idea to write some code that works for all forecasting models.
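Here is a minimal runnable sketch of the loop structure (schematic only: fit_and_forecast is a hypothetical placeholder that just repeats the last observed value; the full, model-agnostic implementation appears in Task 6):

import numpy as np

def fit_and_forecast(history, h):
    # Hypothetical placeholder model: repeat the last observed value h times.
    return np.full(h, history[-1]);

data = np.arange(20.0); # a toy series standing in for real data
T = 10; # initial length of the training period
h = 2; # length of the forecasting window
forecasts = [];
while (T < len(data)):
    forecasts.append(fit_and_forecast(data[:T], h)); # Steps 2 and 3
    T += h; # Step 4: extend the training period
print(np.concatenate(forecasts));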
The second point reflects an important principle in programming: you should AVOID copying and
pasting your code as much as possible. The main reason for this principle is that copy-and-paste
makes maintaining the code very difficult, as it is nearly impossible for a programmer to remember
to make the same change to every copy of the code.
Because we may perform a procedure such as rolling forecast multiple times, it is useful to
code it as a method. Sometimes, coding something as a method does not satisfy our needs. For
example, an ARMA(p, q) forecasting model should remember its p and q, while a method cannot
remember anything. In these situations, we should create our own class. Roughly speaking, a class
is a collection of methods which all need to remember something, as in the sketch below.
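As a minimal sketch of this idea (with hypothetical names; the actual forecasting classes appear in Tasks 6 and 7):

class ARMAOrderHolder:
    """A toy class whose methods share the remembered orders p and q."""

    def __init__(self, p, q):
        self.p = p; # remembered by the instance
        self.q = q;

    def num_parameters(self):
        # Any method can use the remembered values through self.
        return self.p + self.q;

holder = ARMAOrderHolder(2, 1);
print(holder.num_parameters()); # prints 3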

2 Task 5: Properties of OLS regressions


In this task, we examine the consistency and asymptotic normality of some OLS regressions related
to time series. For the purpose of this task, we will always focus on a single coefficient, say θ. We
want to know whether the estimator θ̂ of that particular coefficient θ is consistent and asymptotically
normal. We will generate many samples from the same data generating process and run the same
OLS regression on each sample. This way, we will obtain many i.i.d. draws of the estimator θ̂. If
they are all close to the true value of θ, which we know from our data generating process, then it is
evidence that θ̂ is consistent. When this is the case, we can further explore the sample distribution
of θ̂ and see whether it is Gaussian.
Because we will need to run the same linear regression multiple times, it is useful to implement this
step as a method so that we may use it for different OLS models. The standard format of defining
a method is as follows:
def name_of_method(param1, param2, ...):
    """
    Documentation of this method
    """
    (body_of_the_method)
    return result;
Very importantly, Python uses indentation (meaning the number of spaces at the beginning of each
line) to determine where the body of the method (or other things) begins and ends. All statements
inside the body of a method must have the same indentation.

[ ]: import numpy as np
import statsmodels.api as sm
import pandas as pd

def OLS_coefficient(dep_variable, indep_variable, which_coefficient = 1):
    """
    Return a particular coefficient in an OLS regression.

    Parameters
    ----------
    dep_variable: numpy Array or pandas DataFrame
        The dependent variable.
    indep_variable: numpy Array or pandas DataFrame
        The independent variables, not including the constant.
    which_coefficient: int
        Which regression coefficient should be returned. By default, it is the coefficient of the first regressor.

    Returns
    -------
    float64
        The desired regression coefficient.
    """

    indep_variable = sm.add_constant(pd.DataFrame(indep_variable));
    OLS_model = sm.OLS(dep_variable, indep_variable, missing = 'drop');
    fit_result = OLS_model.fit();
    return (fit_result.params.to_numpy())[which_coefficient];

Before we create any specific OLS model, it is useful to code up a method that analyzes the result.
For every regression model, we will have a true parameter value θ and an i.i.d. sample of the
estimator θ̂. We can examine how close the sample points are to θ via a histogram. It is also
useful to report the mean squared error (or MSE) of the estimation. We want to see that the MSE
diminishes with the number of observations.
To determine whether the estimator is asymptotically normal, we will compare our sample
distribution (after subtracting the true value) with the standard normal distribution. If the estimator is
asymptotically normal, then its sample distribution should be close to a constant multiple of the
standard normal distribution. We can visualize whether this is the case by making the
quantile-quantile (or Q-Q) plot of the sample distribution, divided by its standard deviation, against the
standard normal distribution. If the estimator is asymptotically normal, the plot should be along
the 45-degree line, which is represented by the red line in the Q-Q plot.

[ ]: def analyze_estimator(sample, truth = 0, eps_file_path = None):
    """
    Analyze the sample of an estimator by looking at its histogram and Q-Q plot against the standard normal distribution.

    Parameters
    ----------
    sample: 1-D Array
        A sample of the estimator.
    truth: float
        The true value of the parameter, 0 by default.
    eps_file_path: string or None
        If this is not None, then the plots are saved in the specified path.

    Returns
    -------
    float
        MSE of the sample.
    """

    import matplotlib.pyplot as plt

    plt.rc('font', size = 12);
    fig, [ax1, ax2] = plt.subplots(nrows = 2, ncols = 1);
    ax1.hist(sample, bins = 21);
    sm.qqplot((sample - truth) / np.std(sample), ax = ax2, line = '45');
    if (eps_file_path is not None): # save the figure if a path is supplied; otherwise display it
        plt.savefig(eps_file_path);
    else:
        plt.show();
    return np.mean(np.power(sample - truth, 2));

Now we will use our tools to analyze the OLS estimator of AR coefficients. For simplicity, we will
focus on AR(1) processes. The same technique can be used to analyze AR(p) processes for p > 1.
We generate N i.i.d. sample paths of the AR process, each consisting of T periods. We wish to run
an OLS regression on each column of data. For this purpose, we need to use a loop. The standard
format of the loop is as follows:

for i in range(k):
    (body_of_the_loop)

The code does the following. First, Python sets i to the first entry of the list following the
keyword "in". Then it executes the body of the loop. Then it moves i to the second entry of the list
and executes the body again, and so on. Here, the list is the non-negative integers less than k; it
does not have to be an equally spaced list, though. It could be something like [5, -1, 3], as in the
example below.
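For instance (a trivial illustration), the following loop visits the entries of that list in order:

for i in [5, -1, 3]:
    print(i); # prints 5, then -1, then 3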
We wish to store the values of the estimator in an Array. Because we know that the size of the
Array is N, the standard approach is to first define the Array as N zeros and then modify its entries
later on. As mentioned before in this course, Arrays are meant to be mutable.

[ ]: a1 = 1;
ar1_process = sm.tsa.ArmaProcess([1, -a1]);
T = 100;
N = 10000;
y = ar1_process.generate_sample([T, N]);

coef_sample = np.zeros(N);
for i in range(N):
    coef_sample[i] = OLS_coefficient(y[1:, i], y[: (T - 1), i]); # regress y_t on y_{t-1}
print('MSE = ', analyze_estimator(coef_sample, a1));

Next, we analyze the OLS regression of one ARMA process y_t on another ARMA process x_t
independent of y_t. In the regression equation

    y_t = w x_t + b + ε_t,

the true value of the coefficient w is clearly zero. If the OLS estimator can consistently estimate
this zero, then when we have two correlated ARMA processes, the OLS regression will also be able
to estimate the coefficient w, because a correlated process is a linear combination of x_t and some ε_t.

[ ]: dg_process_y = sm.tsa.ArmaProcess([1, -1]);
dg_process_x = sm.tsa.ArmaProcess([1, -.8, -.2], [1, .7, .4]);
T = 100;
N = 1000;
y = dg_process_y.generate_sample([T, N]);
x = dg_process_x.generate_sample([T, N]);

coef_sample = np.zeros(N);
for i in range(N):
    coef_sample[i] = OLS_coefficient(y[:, i], x[:, i]);
print('MSE = ', analyze_estimator(coef_sample));

3 Task 6: Implementing Rolling Forecast


In this task, we will perform a rolling forecast with a given model and leave the discussion of model
selection for later. As mentioned in the Overview, this task should be implemented as a method that
works for any forecasting model. For this reason, the method should take a forecasting model as a
parameter.
What type should this "model" parameter be? The answer is that we will define a new type! A data
type is formally called a class in Python. In fact, NumPy's Array, pandas' DataFrame, and statsmodels'
ArmaProcess are all classes. An object of a particular class is called an instance of that class. For
example, we created several instances of the class ArmaProcess in the previous task.
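As a quick illustration (a trivial check, reusing the NumPy import from earlier), isinstance tells us whether an object is an instance of a given class; the rolling_forecast method below uses the same check:

import numpy as np

x = np.zeros(3);
print(isinstance(x, np.ndarray)); # True: x is an instance of the NumPy Array class
print(isinstance(x, list)); # False: x is not a Python list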

We will be implementing several forecasting models later. Therefore, it is useful to stipulate what
all these models should be able to do. This step can be done formally with an abstract class, with
all forecasting models declared as children (or subclasses) of this abstract class. I believe
that this is a neat solution, but we will not go to that level of abstraction here. Let us stipulate
informally that every forecasting model should implement a method called "forecast", which takes
an Array and an integer (the length of the forecasting window) as parameters and produces the mean
forecast and its standard deviation.
Mean squared error is a standard performance measure for forecasting models. However, it is
important to visualize the forecasts to know whether the model is doing what it is supposed to do.
For this purpose, we plot the actual data (blue line) and the forecasts (red line) on the same graph.
We also show the error band (defined as the band between mean - sd_mean and mean + sd_mean)
in yellow. When the mean squared error is too large for our liking, we want to understand why. If
the actual data are in the error band of the forecasting model but the band is too wide, then the
model is probably sensible but lacks forecasting power at the horizon of our forecasting window.
If the actual data frequently lie outside the error band, then the model is probably misspecified.

[ ]: def rolling_forecast(data, model, len_training = None, window = 1, plot_result = True, eps_path = None):
    """
    Perform a rolling forecast so that the result can be compared with the true observations.

    Parameters
    ----------
    data: Numpy Array
        This consists of both the training set and the validation/test set.
    model: Object
        This object must have implemented a method get_name() and a method forecast(data, window), which returns an array of mean forecasts and an array of standard errors of the mean.
    len_training: int or None
        Length of the training period. Forecasting starts at the end of the training period. If not supplied, it is .8 times the number of observations.
    window: int
        Length of the forecasting window; 1 by default.
    plot_result: bool
        If True (the default), the data and the forecasts are plotted.
    eps_path: String or None
        If it is not None, it is the file path in which the graph is saved.

    Returns
    -------
    float
        MSE of the forecast.
    Two Arrays
        The first array is the mean forecast and the second array is its standard deviation.
    """

    # The following two lines should be deleted if not using the abstract class approach.
    if (not isinstance(model, ForecastingModel)):
        raise TypeError('The parameter model is supposed to be an instance of ForecastingModel.');

    T = np.shape(data)[0];
    if (len_training is None):
        len_training = int(T * .8);
    mean_forecast = np.zeros(T - len_training);
    se_mean = np.zeros(T - len_training);

    for i in range(len_training, T, window):
        history = data[:i];
        mean_forecast_i, se_mean_i = model.forecast(history, window); # forecast of the current window
        if (i + window >= T): # the end of the forecasting window is beyond T; we truncate the forecast
            mean_forecast[(i - len_training) : (T - len_training)] = mean_forecast_i[: (T - i)];
            se_mean[(i - len_training) : (T - len_training)] = se_mean_i[: (T - i)];
        else:
            mean_forecast[(i - len_training) : (i - len_training + window)] = mean_forecast_i;
            se_mean[(i - len_training) : (i - len_training + window)] = se_mean_i;

    MSE = np.mean(np.power(mean_forecast - data[len_training : T], 2));

    # Next, we plot the forecasting result.
    if (plot_result):
        import matplotlib.pyplot as plt
        plt.rc('font', size = 12);
        plt.plot(range(len_training, T), data[len_training : T], 'b-', label = 'Data');
        plt.fill_between(range(len_training, T), mean_forecast - se_mean, mean_forecast + se_mean, fc = 'y');
        plt.plot(range(len_training, T), mean_forecast, 'r-.', label = model.get_name());
        plt.legend(loc = 'upper left');
        plt.xlabel('Time');
        plt.ylabel('Output');
        if (eps_path is not None):
            plt.savefig(eps_path);
        else:
            plt.show();

    return MSE, mean_forecast, se_mean;

We will implement some more sophisticated forecasting models later on. For now, let us implement
a naive model: RandomWalkForecaster. This model assumes that the underlying data generating
process is a random walk:

    y_t = y_{t-1} + ε_t,

where {ε_t} is white noise. Since ε_t cannot be forecast, our mean forecast for y_t is the last observed
value. If the last observed value is y_s and we wish to forecast y_{s+h}, then

    y_{s+h} = ε_{s+h} + ε_{s+h-1} + ... + ε_{s+1} + y_s.

Since the h noise terms are uncorrelated, the variance of the forecast error is h times the variance of
the white noise. This implies that the standard error of our forecast is simply √h times the standard
deviation of the white noise.
The residual of a model is defined as the part of the data that cannot be explained by the model.
Obviously, we always want our model to explain as much of the data as possible. In the context
of time series forecasting, we often hope that the residual is white noise. For this purpose, we
implement a residual method for our random walk model. In this case, the residual of the model
is y_t − y_{t-1}. If the underlying data generating process is indeed a random walk, then this residual
is white noise. Finally, many models produce a performance measure called the AIC, which is often
used for model selection. We stipulate that every forecasting model implement an aic method, but
the random walk model simply produces 'nan' (not a number) for its AIC.
Our RandomWalkForecaster is so simple that it does not need to remember anything. Therefore, we
only need to implement the relevant methods for our model. It is important to remember that the
first parameter of any class method is always self. This self parameter is never explicitly passed to
the method when it is called, but it allows the method to access information stored in the object.

[ ]: # The entire definition of ForecastingModel should be deleted if not using the abstract class approach.
# I recommend installing the package docstring-inheritance. If you do not want to install that
# package, keep the first line "class ForecastingModel:" as it is. If you have that package,
# replace that line with the following two lines:
# from docstring_inheritance import NumpyDocstringInheritanceMeta
# class ForecastingModel(metaclass=NumpyDocstringInheritanceMeta):

class ForecastingModel:
    """
    The abstract class of forecasting models. All forecasting models should be subclasses of this one.

    Methods
    -------
    get_name()
        The name of the model.
    forecast(data, window)
        Returns the mean forecast and the standard error of the mean given historical data and the forecasting window.
    aic(data)
        Returns the AIC of the model.
    residual(data)
        Returns the residuals of the model, the part of data that cannot be explained by the model.
    """

    def get_name(self):
        """
        The name of the model.

        Returns
        -------
        string
            Name of the model.
        """
        return 'The abstract forecasting model';

    def forecast(self, data, window):
        """
        Returns the mean forecast and the standard error of the mean given historical data and the forecasting window.

        Parameters
        ----------
        data: 1-D Array
            Historical data.
        window: int
            Length of the forecasting window.

        Returns
        -------
        Two Arrays
            The mean forecast and the standard error. The length of both Arrays is window.
        """
        raise NotImplementedError('The abstract class ForecastingModel does not implement the forecast method.');

    def aic(self, data):
        """
        Returns the AIC of the model.

        Parameters
        ----------
        data: 1-D Array
            Historical data.

        Returns
        -------
        float
            The AIC of the model.
        """
        raise NotImplementedError('The abstract class ForecastingModel does not implement the aic method.');

    def residual(self, data):
        """
        Returns the residual of the model.

        Parameters
        ----------
        data: 1-D Array
            Historical data.

        Returns
        -------
        1-D Array
            The residuals of the model. The length of the array is the same as the length of data.
        """
        raise NotImplementedError('The abstract class ForecastingModel does not implement the residual method.');

[ ]: # In the class declaration, "(ForecastingModel)" should be deleted if not using the abstract class approach.

class RandomWalkForecaster(ForecastingModel):
    """
    A forecasting model assuming that the data generating process is a random walk.
    """

    def get_name(self):
        return 'Random walk';

    def forecast(self, data, window):
        T = np.shape(data)[0];
        sigma = np.power(np.mean(np.power(np.diff(data), 2)), .5); # estimate sigma by the root mean square of the first differences
        mean_forecast = np.ones(window) * data[T - 1]; # the mean forecast is always the last observed value
        se_mean = np.power(np.arange(window) + 1, .5) * sigma; # sqrt(h) * sigma for h = 1, ..., window
        return mean_forecast, se_mean;

    def aic(self, data):
        return float('nan');

    def residual(self, data):
        T = np.shape(data)[0];
        result = np.zeros(T);
        result[0] = data[0];
        result[1 : ] = np.diff(data);
        return result;

It is time to test the performance of our random walk model. We will not use any real data for
this purpose. Instead, we generate a sample of a random walk and a sample of an ARMA process
that is not a random walk and let the model forecast these two time series. We see that when the
underlying data generating process is not a random walk, the actual data frequently go beyond
the forecast error band, suggesting that our random walk model is misspecified.

[ ]: dg_process = sm.tsa.ArmaProcess([1, -.7, -.3]);
T = 200;
data = dg_process.generate_sample(T);
model = RandomWalkForecaster();
rolling_forecast(data, model);

4 Task 7: Implementing the ARIMA(p, d, q) Forecaster


The heavy lifting has actually been done in the previous task. Implementing the ARIMA(p,d,q)
forecaster is now quite easy. The only thing to note is that the model has an __init__ method. This
is called the constructor and is the method that is called automatically when an instance of
this class is created. As the name suggests, it initializes the model. In this case, this is done by
setting the AR, integration and MA orders of the model.

[ ]: from statsmodels.tsa.arima.model import ARIMA

class ARIMAForecaster(ForecastingModel):
    """
    The forecasting model assuming that the underlying data generating process is ARIMA(p,d,q).
    """

    def __init__(self, p = 0, d = 0, q = 0):
        """
        Parameters
        ----------
        p: int
            The AR order of the model.
        d: int
            The integration order of the model.
        q: int
            The MA order of the model.
        """
        self.order = (p, d, q);

    def get_name(self):
        return 'ARIMA' + str(self.order);

    def forecast(self, data, window):
        T = np.shape(data)[0];
        model = ARIMA(data, order = self.order);
        model_fit = model.fit();
        prediction = model_fit.get_prediction(T, T + window - 1); # out-of-sample prediction for the next window periods
        return prediction.predicted_mean, prediction.se_mean;

    def aic(self, data):
        model = ARIMA(data, order = self.order);
        return model.fit().aic;

    def residual(self, data):
        model = ARIMA(data, order = self.order);
        return model.fit().resid;

Now that we have implemented two forecasting models, we can compare their performance. First,
let us test the models with simulated data.

[ ]: dg_process = sm.tsa.ArmaProcess([1, -.7, -.3]);
T = 100;
simulated_data = dg_process.generate_sample(T);
model1 = RandomWalkForecaster();
model2 = ARIMAForecaster(2, 1, 0);
MSE1, m1, se1 = rolling_forecast(simulated_data, model1, window = 1);
MSE2, m2, se2 = rolling_forecast(simulated_data, model2, window = 1);
print('MSE of ', model1.get_name(), ' is ', MSE1);
print('MSE of ', model2.get_name(), ' is ', MSE2);

Now let us use some real data: an hourly bandwidth usage data set. Apart from reading the data
from a csv file into memory, there is no difference between this and the exercise with simulated
data.

[ ]: bandwidth_data_set = pd.read_csv('bandwidth.csv');
bandwidth_data_array = bandwidth_data_set['hourly_bandwidth'].to_numpy();
MSE1, m1, se1 = rolling_forecast(bandwidth_data_array, model1, len_training = 9800, window = 2);
MSE2, m2, se2 = rolling_forecast(bandwidth_data_array, model2, len_training = 9800, window = 2);
print('MSE of ', model1.get_name(), ' is ', MSE1);
print('MSE of ', model2.get_name(), ' is ', MSE2);

5 Task 8: Model Selection


Let us assume that we will be using an ARIMA model. The first step is to determine the integration
order. This is done by the following procedure.
1. Set d = 0.
2. If the ADF test rejects a unit root for the data, then return d.
3. Difference the data once and increase d by 1. Go back to Step 2.
This can be done either by hand or automatically with a Python loop, as in the sketch below.
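Here is a minimal sketch of the automatic version (assuming the conventional 5% significance level; the same logic appears inside select_arima_model below):

import numpy as np
from statsmodels.tsa.stattools import adfuller

def integration_order(data):
    d = 0;
    while (adfuller(data)[1] >= .05): # the p-value is at index 1; unit root not rejected
        data = np.diff(data); # difference once more
        d += 1;
    return d;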
The selection of p and q is trickier. The danger of choosing too small a model (in terms of the
number of parameters p + q) is under-fitting: the model fails to capture important features of
the data. The danger of choosing too big a model is over-fitting: the model picks up too many
idiosyncratic features of the training data, resulting in poor forecasting performance.
We demonstrate the problems of under-fitting and over-fitting with simulated data. We simulate
an AR(3) process and fit AR(p) models for p ranging from 0 to 10. Then we plot the MSE of the
rolling forecast using each model.

[ ]: dg_process = sm.tsa.ArmaProcess([1, -.3, -.2, -.25]);

def plot_AR_overfit(dg_process, T, N, max_p):
    data = dg_process.generate_sample([T, N]);
    forecast_MSE = np.zeros(max_p + 1);
    aic = np.zeros(max_p + 1);

    for p in range(max_p + 1):
        model_p = ARIMAForecaster(p);
        def mse_j(data):
            mse, mean_j, se_j = rolling_forecast(data, model_p, len_training = T - 20, plot_result = False);
            return mse;
        mse_p = np.apply_along_axis(mse_j, 0, data); # rolling-forecast MSE of each sample path
        forecast_MSE[p] = np.mean(mse_p);
        aic_p = np.apply_along_axis(model_p.aic, 0, data[: T - 20, :]); # AIC on the training part of each sample path
        aic[p] = np.mean(aic_p);

    import matplotlib.pyplot as plt
    plt.rc('font', size = 12);
    fig, [ax1, ax2] = plt.subplots(nrows = 2, ncols = 1);
    ax1.plot(range(max_p + 1), forecast_MSE);
    ax1.set(xlabel = 'Max lag', ylabel = 'MSE', xlim = [0, max_p + 1]);
    ax2.plot(range(max_p + 1), aic);
    ax2.set(xlabel = 'Max lag', ylabel = 'AIC', xlim = [0, max_p + 1]);
    plt.tight_layout();
    plt.savefig('AR_overfit_' + str(T) + '.eps');

# plot_AR_overfit(dg_process, 100, 100, 10);
# plot_AR_overfit(dg_process, 500, 100, 10);

There are two approaches to model selection. One is to set aside a validation set and compare
the performance of different models (estimated without using the validation set). However, for
standard models like ARIMA, there is an easier way. When we fit an ARIMA model, the algorithm
computes a performance measure called the Akaike Information Criterion (AIC). It is defined as
twice the difference between the number of parameters and the log likelihood, AIC = 2k - 2 log L,
where k is the number of parameters and L is the maximized likelihood. The smaller the AIC, the
better the model is expected to forecast. From the definition, we can see that the AIC rewards good
fit (high log likelihood) and punishes having too many parameters. For ARIMA models, the standard
approach is to select models based on the AIC.
The AIC is mainly used as a tool to avoid over-fitting. Even with the AIC, there is still a possibility
of under-fitting. When our model is under-fitting, there is a forecastable component in the residual,
so the residual will not be white noise. In contrast, the residual of a well-specified ARIMA model is
always white noise. The Ljung-Box test can be used to detect under-fitting: it tests whether a time
series is white noise by checking whether its first few autocorrelations (at positive lags) are all zero.
It is important to note that we must set the correct model_df parameter in the Ljung-Box test
method to get correct p-values.

[ ]: from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.stattools import adfuller

def select_arima_model(training_data, max_p = 5, max_q = 5):
    """
    Select the ARIMA(p,d,q) model that generates the lowest AIC.

    Parameters
    ----------
    training_data: 1-D Array
        The series of training data. If a validation set will be used for evaluating models, it should NOT be included.
    max_p: int
        The maximum AR order to be considered.
    max_q: int
        The maximum MA order to be considered.

    Returns
    -------
    DataFrame
        It consists of five columns: p, d, q, model, and aic. The model column consists of the ARIMAForecaster objects. The DataFrame is sorted by AIC.
    """

    num_models = (max_p + 1) * (max_q + 1);
    ps = np.zeros(num_models);
    qs = np.zeros(num_models);
    aics = np.zeros(num_models);

    # Step 1: determine the integration order.
    d = 0;
    stationary = False;
    diff_d_data = training_data; # data after being differenced d times
    while (not stationary):
        adf_p = adfuller(diff_d_data)[1];
        if (adf_p < .05): # unit root is rejected
            stationary = True;
        else:
            diff_d_data = np.diff(diff_d_data);
            d += 1;

    # Step 2: fit every (p, q) combination and record its AIC.
    models = [];
    counter = 0;
    for p in range(max_p + 1):
        for q in range(max_q + 1):
            models.append(ARIMAForecaster(p, d, q));
            ps[counter] = p;
            qs[counter] = q;
            aics[counter] = models[counter].aic(training_data);
            counter += 1;
    result = pd.DataFrame({'p': ps, 'd': d, 'q': qs, 'model': models, 'aic': aics});
    return result.sort_values('aic');

[ ]: training = bandwidth_data_array[:9800];
models = select_arima_model(training, 4, 4);
optimal_model = models['model'].iloc[0]; # choose the model with the smallest AIC
print('The optimal model is ', optimal_model.get_name());
print('Below is the result of the Ljung-Box test on the residual of the selected model.');
print(acorr_ljungbox(optimal_model.residual(training), model_df = 5)); # this number 5 should be p + q of the optimal model

[ ]: mse, mean_forecast, se_mean = rolling_forecast(bandwidth_data_array, optimal_model, len_training = 9800, window = 2);
print('MSE of ', optimal_model.get_name(), ' is ', mse);

benchmark_model = RandomWalkForecaster();
mse0, mean_forecast0, se_mean0 = rolling_forecast(bandwidth_data_array, benchmark_model, len_training = 9800, window = 2);
print('MSE of Random Walk is ', mse0);
