
St. Francis Institute of Technology

SV Road, Borivali (West), Mumbai 400103

Department of Computer Engineering

Academic Year: 2023-2024 Semester: VIII

Subject: Applied Data Science Class / Division: BE/CMPNA

Name: Jess Lopes    Roll Number: 65

Experiment No.: 9

Implement time series forecasting.

Aim: Implement time series forecasting.

I OBJECTIVE

To understand basic concepts of time series forecasting.

To explore time series forecasting methods.

II THEORY

The investigation of time series can be broadly divided into descriptive modeling, called
time series analysis, and predictive modeling, called time series forecasting.

Time series forecasting is a method used to predict future values based on previously
observed values. It is commonly used in areas such as economics, finance, and demand
forecasting for products.

Fig 1. Taxonomy of time series forecasting techniques.

Time series forecasting can be further classified into four broad categories of techniques:
1. Time Series Decomposition: This approach breaks a time series down into its trend,
seasonality, and residual components and then forecasts each component separately. The
final forecast is obtained by adding the forecasts of the individual components (a small
illustrative sketch follows this list).

2. Smoothing Based Techniques: This category includes methods like moving average
and exponential smoothing that use past values to smooth out noise in the time series
and make predictions.

3. Regression Based Techniques: This category includes methods like linear regression
and ARIMA that use a combination of past values and other predictors to make
predictions.
4. Machine Learning Based Techniques: This category includes methods like random
forests, neural networks, and support vector machines that use algorithms from
machine learning to fit models to the time series data and make predictions. These
methods can handle complex and non-linear data and have achieved state-of-the-art
performance in many time series forecasting tasks.
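
As a brief illustration of the decomposition approach in the first category, the sketch below applies statsmodels' seasonal_decompose to a simulated monthly series. The series values, the additive model, and the period of 12 are assumptions made purely for illustration and are not part of this experiment's dataset.

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Simulated monthly series with an upward trend and yearly seasonality (illustrative values only).
idx = pd.date_range("2022-01-01", periods=36, freq="MS")
values = np.linspace(100, 200, 36) + 15 * np.sin(2 * np.pi * np.arange(36) / 12)
sales = pd.Series(values, index=idx, name="sales")

# Additive decomposition into trend, seasonal, and residual components.
parts = seasonal_decompose(sales, model="additive", period=12)
print(parts.trend.dropna().head())    # smoothed long-run movement
print(parts.seasonal.head())          # repeating yearly pattern
print(parts.resid.dropna().head())    # what remains after trend and seasonality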

Smoothing methods are a class of time series forecasting techniques that remove noise
and make predictions by averaging past values of a time series. Two commonly used
smoothing methods are described below (a short sketch follows the list):

1. Average Method: This method predicts the next value as the average of all past
observations. For example, if a time series y has n observed values, the forecast for the
next value is simply the mean of those n values.

2. Moving Average Smoothing: This method is similar to the average method, but instead
of averaging all past values it averages only a sliding window of fixed size. For example,
if the window size is m, the forecast for the next value is the average of the last m values.
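
A minimal sketch of both smoothing methods using pandas; the series values and the window size of 7 below are assumptions made only for illustration.

import numpy as np
import pandas as pd

# Made-up daily series: a mild trend plus noise (illustrative values only).
rng = np.random.default_rng(0)
y = pd.Series(200 + 5 * np.arange(60) + rng.normal(0, 20, 60), name="y")

# Average method: forecast the next value as the mean of all past observations.
forecast_avg = y.mean()

# Moving average smoothing: average a sliding window of size m.
m = 7
smoothed = y.rolling(window=m).mean()
forecast_ma = y.tail(m).mean()   # next-step forecast = mean of the last m values

print("Average-method forecast: %.2f" % forecast_avg)
print("Moving-average forecast: %.2f" % forecast_ma)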

Regression Based Techniques

1. Linear Regression: Time series forecasting can also be performed using linear
regression, a supervised machine learning method that models the relationship between a
dependent variable and one or more independent variables. In time series forecasting, the
dependent variable is the series itself and the independent variable is time (a minimal
sketch appears after this list).

2. ARIMA Model: ARIMA, which stands for Autoregressive Integrated Moving Average,
is a commonly used method for time series analysis and forecasting. It is a regression-based
model that uses past values of the series, together with past forecast errors, to make
predictions.
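
The sketch below shows the linear-regression idea with scikit-learn, treating the time index itself as the single predictor. The simulated series and the 10-step forecast horizon are assumptions for illustration only; the experiment's own implementation, using AutoReg, ARIMA and SARIMAX, follows in the next section.

import numpy as np
from sklearn.linear_model import LinearRegression

# Simulated series: a linear trend plus noise (illustrative values only).
rng = np.random.default_rng(42)
y = 50 + 2.5 * np.arange(100) + rng.normal(0, 5, 100)

# The time index t = 0, 1, 2, ... is the single independent variable.
t = np.arange(len(y)).reshape(-1, 1)
model = LinearRegression().fit(t, y)

# Forecast the next 10 steps by extrapolating the fitted straight line.
t_future = np.arange(len(y), len(y) + 10).reshape(-1, 1)
print(model.predict(t_future))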

III IMPLEMENTATION

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the restaurant sales dataset (columns include date and inside_sales).
dataset = pd.read_csv('https://raw.githubusercontent.com/maks-p/restaurant_sales_forecasting/master/csv/CSV_for_EDA.csv')

# Chronological 90/10 train-test split.
total_data = dataset["date"].count()
split = int(total_data * 0.90)
train = dataset[:split]
test = dataset[split:]

plt.figure(figsize=(12, 8))
plt.plot(train.date, train.inside_sales, label='Train')
plt.plot(test.date, test.inside_sales, label='Test')
plt.xticks(rotation='vertical')
plt.legend(loc='best')
plt.title("Train Test Split")

plt.show()

from statsmodels.tsa.ar_model import AutoReg

# Autoregressive (AR) model with 7 lags and a constant term.
model_ag = AutoReg(endog=train["inside_sales"],
                   lags=7,
                   trend='c',
                   seasonal=False,
                   exog=None,
                   hold_back=None,
                   period=None,
                   missing='none')
fit_ag = model_ag.fit()
print("Coefficients: \n%s" % fit_ag.params)
Coefficients:
const 9441.469011
inside_sales.L1 0.128783
inside_sales.L2 -0.051229
inside_sales.L3 0.004284
inside_sales.L4 -0.072142
inside_sales.L5 0.011292
inside_sales.L6 0.123930
inside_sales.L7 0.209163
dtype: float64
predictions = fit_ag.predict(start=len(train), \
end=len(train)+len(test)-1, \
dynamic=False)
predictions.name = "Predictions"
result = pd.concat([test, predictions], axis=1).reindex(test.index)
print(result)

from sklearn.metrics import mean_squared_error
from math import sqrt

rmse = sqrt(mean_squared_error(test["inside_sales"], predictions))
print("AR Root Mean Square Error (RMSE): %.3f" % rmse)

AR Root Mean Square Error (RMSE): 2427.322
from statsmodels.tsa.arima.model import ARIMA

# endog: dependent variable, response variable or y (endogenous)
# order: (p, d, q) orders of the autoregressive, differencing and moving average components;
# (0, 0, 2) gives a pure moving average model, MA(2).
model_ma = ARIMA(endog=train["inside_sales"], order=(0, 0, 2))
fit_ma = model_ma.fit()
print("Coefficients: \n%s" % fit_ma.params)
Coefficients:
const 1.460960e+04
ma.L1 1.679819e-01
ma.L2 -3.590835e-02
sigma2 5.585174e+06
dtype: float64
predictions_ma = fit_ma.predict(start=len(train),
                                end=len(train) + len(test) - 1,
                                dynamic=False)
predictions_ma.name = "Predictions"
result_ma = pd.concat([test, predictions_ma], axis=1).reindex(test.index)
print(result_ma)

from sklearn.metrics import mean_squared_error
from math import sqrt

rmse_ma = sqrt(mean_squared_error(test["inside_sales"], predictions_ma))

print("MA - Root Mean Square Error (RMSE): %.3f" % rmse_ma)

MA - Root Mean Square Error (RMSE): 2457.157


plt.figure(figsize=(12,8))
plt.plot(train.date, train.inside_sales, label='Train')
plt.plot(test.date, test.inside_sales, label='Test')
plt.plot(result_ma.date, result_ma. Predictions, label='Prediction')
plt.xticks(dataset ["date"], dataset ["date"], rotation='vertical')
plt.legend(loc='best')
plt.title("Predictions MA model")
plt.xlabel('date')
plt.ylabel('inside_sales')
plt.show()

from statsmodels.tsa.arima.model import ARIMA

# ARIMA(1, 1, 1): one autoregressive lag, first-order differencing, one moving average lag.
model_arima = ARIMA(endog=train["inside_sales"], order=(1, 1, 1))
fit_arima = model_arima.fit()
print("Coefficients: \n%s" % fit_arima.params)

Coefficients:
ar.L1 1.433320e-01
ma.L1 -9.792971e-01
sigma2 5.628931e+06
dtype: float64

predictions_arima = fit_arima.predict(start=len(train),
                                      end=len(train) + len(test) - 1,
                                      dynamic=False)
predictions_arima.name = "Predictions"
result_arima = pd.concat([test, predictions_arima], axis=1).reindex(test.index)
print(result_arima)


from sklearn.metrics import mean_squared_error
from math import sqrt

rmse_arima = sqrt(mean_squared_error(test["inside_sales"], predictions_arima))
print("ARIMA Root Mean Square Error (RMSE): %.3f" % rmse_arima)

ARIMA Root Mean Square Error (RMSE): 2387.420

plt.figure(figsize=(12, 8))
plt.plot(train.date, train.inside_sales, label='Train')
plt.plot(test.date, test.inside_sales, label='Test')
plt.plot(result_arima.date, result_arima.Predictions, label='Prediction')
plt.xticks(dataset["date"], dataset["date"], rotation='vertical')
plt.legend(loc='best')
plt.title("Predictions ARIMA model")
plt.xlabel('date')
plt.ylabel('inside_sales')
plt.show()

from statsmodels.tsa.statespace.sarimax import SARIMAX

# SARIMA with the same non-seasonal order (1, 1, 1) and no seasonal terms.
model_sarima = SARIMAX(endog=train["inside_sales"],
                       order=(1, 1, 1),
                       seasonal_order=(0, 0, 0, 0))
fit_sarima = model_sarima.fit()

print("Coefficients: \n%s" % fit_sarima.params)

Coefficients:
ar.L1 1.433320e-01
ma.L1 -9.792971e-01
sigma2 5.628931e+06
dtype: float64
predictions_sarima = fit_sarima.predict(start = len(train), \
end = len(train)+len(test)-1, \
dynamic = False)
predictions_sarima.name = "Predictions"
result_sarima = pd.concat([test, predictions_sarima], axis=1) \
.reindex(test.index)
print (result_sarima)

from sklearn.metrics import mean_squared_error
from math import sqrt

rmse_sarima = sqrt(mean_squared_error(test["inside_sales"], predictions_sarima))
print("SARIMA Root Mean Square Error (RMSE): %.3f" % rmse_sarima)

SARIMA Root Mean Square Error (RMSE): 2387.420

plt.figure(figsize=(12, 8))
plt.plot(train.date, train.inside_sales, label='Train')
plt.plot(test.date, test.inside_sales, label='Test')
plt.plot(result_sarima.date, result_sarima.Predictions, label='Prediction')
plt.xticks(dataset["date"], dataset["date"], rotation='vertical')
plt.legend(loc='best')
plt.title("Predictions SARIMA model")
plt.xlabel('date')
plt.ylabel('inside_sales')
plt.show()

IV CONCLUSION

We have understood the basic concepts of time series forecasting and implemented AR, MA, ARIMA, and SARIMA models on a restaurant sales time series, comparing them by RMSE; the ARIMA and SARIMA models gave the lowest error (RMSE 2387.420).

V REFERENCES

https://www.justintodata.com/arima-models-in-python-time-series-prediction/

https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/

https://cprosenjit.medium.com/10-time-series-forecasting-methods-we-should-know-291037d2e285

VI POST LAB QUESTION/ANSWER

1. Explain the three components of the ARIMA model.

The ARIMA model is a combination of three components: the autoregressive (AR) component, the integrated/differencing (I) component, and the moving average (MA) component. The autoregressive component models the relationship between the current value and its past values, the differencing component removes non-stationarity from the time series, and the moving average component models the relationship between the current value and past forecast errors.
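
To make the mapping between these components and the code concrete, the minimal sketch below fits an ARIMA(1, 1, 1) model with statsmodels, where in order=(p, d, q) the value p=1 is the AR order, d=1 is the number of differences, and q=1 is the MA order. The simulated random-walk series is an assumption used only for this illustration.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Simulated non-stationary series (random walk with drift); values are illustrative only.
rng = np.random.default_rng(7)
y = pd.Series(np.cumsum(rng.normal(1.0, 2.0, 200)))

# order = (p, d, q): p = AR lags, d = differencing steps, q = MA lags.
fit = ARIMA(y, order=(1, 1, 1)).fit()
print(fit.params)              # ar.L1, ma.L1 and sigma2, as in the outputs above
print(fit.forecast(steps=5))   # forecasts returned on the original, undifferenced scale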
