Time Series Analysis

Time series analysis and forecasting algorithms are used on furniture sales data from a Superstore sales dataset covering 4 years. Two algorithms are used: 1) ARIMA is used to generate a one-step ahead forecast and 100 step forecast for furniture sales. 2) The Prophet algorithm is used to forecast furniture and office supplies sales, generating forecasts for each out to 36 months. Results are plotted for each.

Uploaded by Nimish Agrawal

Time Series Analysis

The algorithms are applied to Superstore sales data. The dataset has several categories; we
start with time series analysis and forecasting for furniture sales.

We will use

ARIMA Algorithm
Prophet Algorithm

Dataset: Superstore Sales Data. The dataset contains 4 years of sales records from a superstore.

Dataset link: https://community.tableau.com/docs/DOC-1236

Data is available only through 2017; we will forecast out to about 2021.

In [ ]:
import warnings
import itertools
import numpy as np
import matplotlib.pyplot as plt
warnings.filterwarnings("ignore")
plt.style.use('fivethirtyeight')
import pandas as pd
import statsmodels.api as sm
import matplotlib
matplotlib.rcParams['axes.labelsize'] = 14
matplotlib.rcParams['xtick.labelsize'] = 12
matplotlib.rcParams['ytick.labelsize'] = 12
matplotlib.rcParams['text.color'] = 'k'

First, we analyze the Furniture category using the ARIMA algorithm.

In [ ]:
df = pd.read_excel("Superstore.xls")
furniture = df.loc[df['Category'] == 'Furniture']

In [ ]:

cols = ['Row ID', 'Order ID', 'Ship Date', 'Ship Mode', 'Customer ID', 'Customer Name',
        'Segment', 'Country', 'City', 'State', 'Postal Code', 'Region', 'Product ID',
        'Category', 'Sub-Category', 'Product Name', 'Quantity', 'Discount', 'Profit']
# Keep only 'Order Date' and 'Sales'; reassign instead of dropping in place on a slice of df
furniture = furniture.drop(columns=cols)
furniture = furniture.sort_values('Order Date')
# Count missing values per remaining column
furniture.isnull().sum()

Out[ ]:
Order Date 0
Sales 0
dtype: int64

In [ ]:

furniture = furniture.groupby('Order Date')['Sales'].sum().reset_index()

In [ ]:
furniture = furniture.set_index('Order Date')
furniture.index
Out[ ]:

DatetimeIndex(['2014-01-06', '2014-01-07', '2014-01-10', '2014-01-11',
               '2014-01-13', '2014-01-14', '2014-01-16', '2014-01-19',
               '2014-01-20', '2014-01-21',
               ...
               '2017-12-18', '2017-12-19', '2017-12-21', '2017-12-22',
               '2017-12-23', '2017-12-24', '2017-12-25', '2017-12-28',
               '2017-12-29', '2017-12-30'],
              dtype='datetime64[ns]', name='Order Date', length=889, freq=None)

In [ ]:
y = furniture['Sales'].resample('MS').mean()
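
The `resample('MS').mean()` call groups the per-date sales totals by calendar month, averages them, and labels each month by its first day. So `y` holds the average daily sales for each month (over the days that had orders), not the monthly total. A minimal sketch on made-up numbers; the dates and values below are illustrative only and are not taken from the Superstore data:

```python
import pandas as pd

# Hypothetical per-date sales totals (made up for illustration).
daily = pd.Series(
    [100.0, 200.0, 50.0],
    index=pd.to_datetime(["2014-01-06", "2014-01-07", "2014-02-03"]),
    name="Sales",
)

# 'MS' = month start: one bucket per calendar month, labelled by its first day.
monthly = daily.resample("MS").mean()
print(monthly)
```

January averages its two entries and February keeps its single value.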

In [ ]:

y.plot(figsize=(15, 6))
plt.show()

In [ ]:
p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in list(itertools.product(p, d, q))]
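
The cell above builds candidate parameter combinations, but the cells shown here never use them; the model is fitted with fixed orders. One common use for such lists is a grid search that scores every combination and keeps the lowest score (for example, the AIC of each fitted model). The following is a minimal sketch under that assumption. `grid_search` and `toy_score` are hypothetical names introduced for illustration; in the notebook the score could come from something like `sm.tsa.statespace.SARIMAX(y, order=order, seasonal_order=seasonal).fit().aic`.

```python
import itertools

p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))              # 8 non-seasonal (p, d, q) orders
seasonal_pdq = [(P, D, Q, 12) for P, D, Q in pdq]   # 8 seasonal orders, period 12


def grid_search(score):
    """Return (best_score, order, seasonal_order) minimising score(order, seasonal_order).

    `score` is a hypothetical callable; combinations that raise are skipped.
    """
    best = None
    for order in pdq:
        for seasonal in seasonal_pdq:
            try:
                value = score(order, seasonal)
            except Exception:
                continue  # e.g. a model that fails to fit
            if best is None or value < best[0]:
                best = (value, order, seasonal)
    return best


def toy_score(order, seasonal):
    """Stand-in score for demonstration only: fewer parameters is better."""
    return sum(order) + sum(seasonal[:3])


print(len(pdq), len(seasonal_pdq))
print(grid_search(toy_score))
```

With the stand-in score, the search simply returns the all-zero combination; with a real AIC score it would return whichever of the 64 combinations fits best.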

1) ARIMA Algorithm
In [ ]:
# Seasonal ARIMA: order=(p, d, q), seasonal_order=(P, D, Q, s) with s=12 months
mod = sm.tsa.statespace.SARIMAX(y,
                                order=(1, 1, 1),
                                seasonal_order=(1, 1, 0, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()

In [ ]:
pred = results.get_prediction(start=pd.to_datetime('2017-01-01'), dynamic=False)
pred_ci = pred.conf_int()
ax = y['2014':].plot(label='observed')
pred.predicted_mean.plot(ax=ax, label='One-step ahead Forecast', alpha=.7, figsize=(14, 7))
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.2)
ax.set_xlabel('Date')
ax.set_ylabel('Furniture Sales')
plt.legend()
plt.show()
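
The plot compares the one-step-ahead forecast with the observed series visually. One way to put a number on that comparison is the root mean squared error (RMSE). The sketch below is not part of the original notebook; `rmse` is a hypothetical helper and the values are made up. In the notebook it could be applied to `y['2017-01-01':]` and `pred.predicted_mean`.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual, predicted = list(actual), list(predicted)
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration: errors of 1 and 7 give an MSE of 25.
print(rmse([400.0, 500.0], [399.0, 493.0]))  # 5.0
```

The RMSE is in the same units as the series, so it can be read directly against the scale of the sales axis.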
In [ ]:
pred_uc = results.get_forecast(steps=100)
pred_ci = pred_uc.conf_int()
ax = y.plot(label='observed', figsize=(14, 7))
pred_uc.predicted_mean.plot(ax=ax, label='Forecast')
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.25)
ax.set_xlabel('Date')
ax.set_ylabel('Furniture Sales')
plt.legend()
plt.show()
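
Because the series is monthly, each forecast step is one month. A quick way to check how far a horizon reaches is to count months from the last observed month (December 2017). The sketch below uses a hypothetical helper, `month_after`, to do that arithmetic for the 100-step forecast above and for the 36-month horizon used with Prophet in the next section.

```python
def month_after(year, month, steps):
    """Return (year, month) reached `steps` months after the given month."""
    index = year * 12 + (month - 1) + steps
    return index // 12, index % 12 + 1


last_observed = (2017, 12)  # final month in the resampled series

print(month_after(*last_observed, 100))  # (2026, 4)
print(month_after(*last_observed, 36))   # (2020, 12)
```

So 100 monthly steps run to April 2026, while 36 steps end in December 2020. The confidence band in the plot widens over such a long horizon, which is worth keeping in mind when reading the later years.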

2) Prophet Algorithm
We will use the Prophet algorithm to forecast sales for both furniture and office supplies.

In [ ]:
furniture = df.loc[df['Category'] == 'Furniture']
office = df.loc[df['Category'] == 'Office Supplies']
In [ ]:
cols = ['Row ID', 'Order ID', 'Ship Date', 'Ship Mode', 'Customer ID', 'Customer Name',
        'Segment', 'Country', 'City', 'State', 'Postal Code', 'Region', 'Product ID',
        'Category', 'Sub-Category', 'Product Name', 'Quantity', 'Discount', 'Profit']
# Keep only 'Order Date' and 'Sales'; reassign instead of dropping in place on a slice of df
furniture = furniture.drop(columns=cols)
office = office.drop(columns=cols)
furniture = furniture.sort_values('Order Date')
office = office.sort_values('Order Date')
# Total sales per order date
furniture = furniture.groupby('Order Date')['Sales'].sum().reset_index()
office = office.groupby('Order Date')['Sales'].sum().reset_index()
furniture = furniture.set_index('Order Date')
office = office.set_index('Order Date')
# Average daily sales per month, labelled by month start
y_furniture = furniture['Sales'].resample('MS').mean()
y_office = office['Sales'].resample('MS').mean()
furniture = pd.DataFrame({'Order Date': y_furniture.index, 'Sales': y_furniture.values})
office = pd.DataFrame({'Order Date': y_office.index, 'Sales': y_office.values})
# Side-by-side table of both categories, joined on the month
store = furniture.merge(office, how='inner', on='Order Date')
store.rename(columns={'Sales_x': 'furniture_sales', 'Sales_y': 'office_sales'}, inplace=True)
store.head()
Out[ ]:

   Order Date  furniture_sales  office_sales
0  2014-01-01       480.194231    285.357647
1  2014-02-01       367.931600     63.042588
2  2014-03-01       857.291529    391.176318
3  2014-04-01       567.488357    464.794750
4  2014-05-01       432.049188    324.346545

In [ ]:
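# Note: from version 1.0 onward the package is published as `prophet`
# (i.e. `from prophet import Prophet`); this notebook uses the older `fbprophet` name.
from fbprophet import Prophet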
furniture = furniture.rename(columns={'Order Date': 'ds', 'Sales': 'y'})
furniture_model = Prophet(interval_width=0.95)
furniture_model.fit(furniture)
office = office.rename(columns={'Order Date': 'ds', 'Sales': 'y'})
office_model = Prophet(interval_width=0.95)
office_model.fit(office)
furniture_forecast = furniture_model.make_future_dataframe(periods=36, freq='MS')
furniture_forecast = furniture_model.predict(furniture_forecast)
office_forecast = office_model.make_future_dataframe(periods=36, freq='MS')
office_forecast = office_model.predict(office_forecast)
# Prophet's plot() creates its own figure, so pass figsize to it directly;
# a separate plt.figure() call only produces an extra, empty figure.
furniture_model.plot(furniture_forecast, xlabel='Date', ylabel='Sales', figsize=(18, 6))
plt.title('Furniture Sales');

INFO:numexpr.utils:NumExpr defaulting to 2 threads.
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.

In [ ]:
# Pass figsize to Prophet's plot() rather than opening a separate, empty figure
office_model.plot(office_forecast, xlabel='Date', ylabel='Sales', figsize=(18, 6))
plt.title('Office Supplies Sales');
