2020 First International Conference on Power, Control and Computing Technologies (ICPC2T)

Sales Forecast for Amazon Sales with Time Series Modeling

Balpreet Singh
Dept. Computer Science and Engineering
Dr. B R Ambedkar National Institute of Technology, Jalandhar
Punjab, India
singh.balpreet@hotmail.com

Pawan Kumar
Dept. Computer Science and Engineering
Dr. B R Ambedkar National Institute of Technology, Jalandhar
Punjab, India
pawankt10@outlook.com

Dr. Nonita Sharma
Dept. Computer Science and Engineering
Dr. B R Ambedkar National Institute of Technology, Jalandhar
Punjab, India
nsnonita@gmail.com

Dr. K P Sharma
Dept. Computer Science and Engineering
Dr. B R Ambedkar National Institute of Technology, Jalandhar
Punjab, India
sharmakp@nitj.ac.in

978-1-7281-4997-4/20/$31.00 ©2020 IEEE

Abstract—Accurate sales prediction plays an important role in reducing costs and improving customer service levels, especially for B2C (business-to-consumer) e-commerce. This paper attempts to forecast future sales at Amazon.com, Inc. based on historical sales data. Firstly, it proposes three possible forecasting approaches according to the historical data pattern: Holt-Winters exponential smoothing, a neural network autoregression model and ARIMA (autoregressive integrated moving average). Secondly, it specifies certain accuracy measures with which we determine the suitability of the forecast methods on the available sales data. Finally, the three methods are implemented to forecast Amazon's quarterly sales in 2019. The results can help Amazon manage its future operations well.

Index Terms—Forecasting, ARIMA, Holt-Winters Exponential Smoothing, Neural Network Auto Regression Model

I. INTRODUCTION

For retail companies, it is important to record and maintain sales-related measures such as the number of transactions, page hits and revenue generated over time, and then to use this data to accurately predict values for these measures in order to plan efficiently for future scenarios [1]. Our work is based on the revenue data of Amazon.com, Inc., a US-based online retail company that focuses on e-commerce, cloud computing, digital streaming and artificial intelligence. As online sales are rising at an immense rate, consistent projection of sales allows the company to prepare efficiently for unexpected situations such as web traffic spikes or sudden disturbances to product stock.

For an online retailer, special online shopping festivals, such as Black Friday and Cyber Monday in the USA and November 11 in China, make up a substantial percentage of sales. This is especially true for Amazon.com. In order to provide better service during the holiday shopping season, Amazon hired a lot of temporary staff and extra permanent employees. During this season, it is a major challenge for Amazon to allocate resources such as employees, third-party logistics and items to attain higher customer satisfaction. Therefore, it is useful to forecast future quarterly net sales [2]–[5], which can help Amazon prepare for future Black Fridays and Cyber Mondays. The company earns around 5% of its total annual revenue during these special online shopping festivals. As a result, they are considered a major parameter for predicting overall sales. Predicting sales on such days is particularly demanding, as there is generally a huge spike (i.e. an anomaly) relative to normal working days. For example, total revenue generated on Black Friday is generally more than 10 times the median sales of the year.

In this paper, we use different forecasting methodologies to forecast Amazon's future quarterly net sales, based on its historic quarterly data. In particular, we use ETS (Error, Trend and Seasonality), which applies the Holt-Winters exponential smoothing model multiple times with varying parameters, ARIMA (autoregressive integrated moving average) [6], [7] and a neural network autoregression model [7]–[10]. First, we will use three different methods to forecast the quarterly net sales in 2019 [7]. Then we will test and justify different approaches by comparing forecasting data with the actual net sales in 2019, and find the best approach suitable for forecasting future quarterly net sales at Amazon. In order to measure the accuracy of predictions made by a particular model, we use measures such as MAPE (mean absolute percentage error) [11] and RMSE (root mean square error) [12], among others.

In the following sections we briefly explain the proposed models, present a general architecture explaining the various stages in the forecasting process and then provide the comparative performance of these models.

II. METHODOLOGY

In this section, we provide details regarding the proposed models that are used to generate quarterly sales forecasts, along with a general process flow on how we will be applying these
models and what are the various transformations that the data will undergo.

A. Process flow

First, Amazon's quarterly sales data is gathered and imported into the RStudio working environment. The data is then transformed into time series format, and any inconsistencies in it, such as missing values and noise, are dealt with [3].

After data transformation, in the case of ARIMA [6], [13], we determine whether the data is stationary and, if not, make it stationary so that it is suitable for applying the ARIMA model. For the other models under consideration, no further pre-processing or stationarity is required. Once all the models are applied, we determine the accuracy of the predictions made by the different models and compare which fits the data better.

B. Holt-Winters exponential smoothing

Exponential smoothing forecast methods base their predictions on a weighted sum of past observations. Instead of assigning equal weight to each past value, weights are assigned in exponentially decreasing order, so that more recent observations count more. It is used to perform forecasting on univariate time series data [14].

Holt-Winters exponential smoothing is an extension of exponential smoothing that, in addition to accommodating the effects of trend, also incorporates the seasonality present in the time series data. It includes three smoothing parameters:
• Alpha - The value of alpha ranges between 0 and 1. It specifies the proportion of weight assigned to the most recent value, while (1 - alpha) is the weight assigned to the rest of the historical data. Large values indicate that we depend more on recent values, whereas small values indicate that more attention is given to historical data.
• Beta - The smoothing parameter for the trend component. It controls how quickly the estimated trend responds to recent changes.
• Gamma - The smoothing parameter for the seasonal component. It controls how quickly the estimated seasonal effects respond to recent changes.

If trend and seasonality change linearly, we can model them in an additive manner; if the trend and seasonality present in the data change exponentially, they can be modelled in a multiplicative manner.
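
The role of the three smoothing parameters can be made concrete with a minimal sketch of the additive Holt-Winters recursions. It is written in Python purely for illustration and is not the R code used in this study. The function name `holt_winters_additive`, the simple start-up values for level, trend and seasonal terms, and the toy series are all assumptions, and other formulations of the seasonal update exist.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters sketch: forecasts `horizon` steps past the end of y.

    y is a list of observations and m is the seasonal period (4 for quarters).
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Start-up values (an illustrative assumption, not a prescribed method).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        s_old = season[t % m]
        prev_level = level
        # alpha: weight on the newest (deseasonalised) observation.
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + trend)
        # beta: how fast the trend estimate follows changes in the level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # gamma: how fast the seasonal estimate follows new data.
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * s_old

    n = len(y)
    return [level + h * trend + season[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Toy quarterly series with a pure seasonal pattern and no trend.
toy = [1, 2, 3, 4] * 3
forecast = holt_winters_additive(toy, m=4, alpha=0.5, beta=0.3, gamma=0.2, horizon=4)
print([round(v, 6) for v in forecast])
```

On this toy series the sketch simply reproduces the repeating seasonal pattern. A multiplicative variant would divide by the seasonal term instead of subtracting it.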
C. ARIMA

ARIMA stands for AutoRegressive Integrated Moving Average [7], [15]. It incorporates the principles of the simpler autoregressive and moving average methods, and to these it adds the concept of integration.

The name of the model itself captures the main concepts on which the model is based:
• AR: Autoregression. It predicts future values based on p past values, often referred to as lags.
• I: Integrated. ARIMA works on stationary data, i.e. the mean, variance and covariance of the data need to be time invariant. The concept of integration refers to differencing the data points, i.e. subtracting from each data point its immediate predecessor, in order to make the time series data stationary.
• MA: Moving Average. It is similar to AR, with the notable difference that instead of past values (lags), we depend on the error terms associated with these past values. We base our prediction on q error terms.
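
The differencing step behind the "I" component can be illustrated with a short sketch. It is written in Python for illustration only and is not the authors' R code; the helper name `difference` and the toy numbers are assumptions.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'_t = y_t - y_(t-1)."""
    out = list(series)
    for _ in range(d):
        # Each pass subtracts the previous value and shortens the series by one.
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


linear = [10, 12, 14, 16, 18]      # constant upward trend
quadratic = [1, 4, 9, 16, 25]      # trend that itself keeps growing

print(difference(linear, d=1))     # [2, 2, 2, 2]
print(difference(quadratic, d=1))  # [3, 5, 7, 9]
print(difference(quadratic, d=2))  # [2, 2, 2]
```

A linear trend disappears after one difference, while a series whose growth itself increases needs a second difference before its level stops drifting.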
The standard notation used to specify an ARIMA model is ARIMA(p, d, q). The parameters of the model are defined as follows:
• p: The number of past values (lags) included to make the prediction.
• d: The number of times that the time series data needs to be differenced in order to make it stationary.
• q: The number of past error terms included in the model, also referred to as the size of the moving average window.
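
In the usual textbook form, which this paper does not state explicitly, an ARIMA(p, d, q) model can be written for the series $y'_t$ obtained after differencing $d$ times. The symbols $c$ (constant), $\phi_i$ (autoregressive coefficients), $\theta_j$ (moving average coefficients) and $\varepsilon_t$ (error term) are introduced here only for illustration:

$$y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t$$

For example, with p = 1, d = 2 and q = 2, the twice-differenced series depends on its one previous value and on the two previous error terms.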
Fig. 1. Process flow diagram of forecasting.

D. Neural network model

A neural network is an attempt to simulate/model how a human mind works [8], [9]. It can be visualised as a graph
where each node is referred to as a neuron and edges connect pairs of neurons. An incoming edge to a neuron represents an input along with a certain weight assigned to that input, and an outgoing edge represents the output generated by applying a nonlinear function to the weighted sum of all its inputs. Given a suitable number of nonlinear processing units, a neural network can, over time, learn from experience to approximate a complex target function with satisfactory accuracy.

A feedforward neural network is a widely adopted model for time series forecasting applications. Fig. 2 shows a typical three-layered feedforward neural network in which nodes at the input layer acquire the past observations, perform certain processing and forward the results to nodes at the subsequent layer, while the node at the output layer furnishes the forecast for the future values. Nodes at the hidden layer are used for processing the data and applying certain nonlinear operations to them.

Fig. 2. Architecture of neural network model with single hidden layer.

E. Accuracy measures

Whenever we make estimates or projections, there is bound to be some error associated with them. The same is the case with any forecasting technique. In order to find out how good a prediction is, we need measures that indicate the accuracy of the results [12]. These measures check the difference between the actual values and the predicted values. In this paper we evaluate and compare the accuracy of a forecasting technique based on MAPE (mean absolute percentage error) [11]. It measures accuracy as a percentage and is calculated as the average, over all time periods, of the absolute difference between the actual and forecast values divided by the actual value. Where $A_t$ is the actual value, $F_t$ is the forecast value and $n$ is the number of periods, this is given by (1):

$$M = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \qquad (1)$$

III. RESULTS AND DISCUSSIONS

In this paper, we try to forecast the quarterly net sales in 2018 and then compare the actual and forecast data. Usually, there is seasonality present in a retailer's sales, resulting from the holiday shopping season. Thus, it is important to collect data continuously. We obtained Amazon's net sales for the first through the fourth quarter of each year from 2005 to 2018, which amounted to a total of 56 observations (actual net sales data). There is seasonality present in the sales of Amazon. The revenue generated during the fourth quarter was the highest of all quarters in every year of the data. This is typical for a retailer, since the fourth quarter encompasses the holiday shopping season that typically runs from late November through the end of December. In addition, sales have been increasing over time. The deseasonalized sales chart shows that sales have been rising at an increasing rate.

Based on the analysis of the data pattern, we can use Holt-Winters exponential smoothing, the neural network autoregression model and ARIMA (autoregressive integrated moving average) to forecast Amazon's quarterly net sales. We used RStudio to perform all of our analysis.

In this section we give a brief description of the dataset in terms of the data collection process, the attributes present in the data and the frequency at which the data is recorded on a yearly basis. We also present the forecasting results obtained by applying the above-mentioned models, along with a comparative analysis based on certain accuracy measures [8].

A. Dataset Description

The sales data was obtained from the Form 10-K and annual report released by Amazon.com at the start of 2019. The report details its financial performance. Every publicly traded company in the United States needs to file this report with the SEC (US Securities and Exchange Commission). We obtained quarterly sales data from the first quarter of 2005 to the fourth quarter of 2018, in millions of US dollars. Fig. 3 shows a snapshot of the dataset.

B. Holt-Winters Exponential smoothing (Using ETS)

Holt-Winters exponential smoothing takes into account both the trend and the seasonal patterns of the data as the smoothing process is applied [7]. In this model, sales is chosen as the dependent (target) variable, time is set as the independent variable and, because the data is quarterly, the seasonal period is four.

ETS(M, A, A) showed the most promising results. Here, in the order specified, M indicates a multiplicative error component, A an additive trend component and A an additive seasonality component. The smoothing parameters of the above model are:
• α = 0.4485
• β = 0.2138
• γ = 0.1852

The forecast results are shown in TABLE I, along with a supporting graph in Fig. 4.

C. ARIMA

For the ARIMA approach, a number of ARIMA models are compared, and the model that produces random residuals with the lowest RMSE for the 2005 to 2017 sales will be used to
forecast the 2018 sales. From the data pattern, we find that the data is non-stationary because of the upward trend. Thus, we transform the data to make it stationary. Then the ARIMA model, with chosen p, d and q, can be used to forecast Amazon's quarterly sales in 2018.

Fig. 3. Quarterly Amazon sales data (2005-18).

TABLE I
ETS 2018

Period     Forecast (M USD)   Lower 95% Limit   Upper 95% Limit
Q1/2018    43816.90           38568.81          49064.99
Q2/2018    45144.66           38407.84          51881.48
Q3/2018    48297.51           39348.48          57246.54
Q4/2018    58302.87           45124.29          71481.46

Fig. 4. Graph depicting the forecasting results obtained by applying ETS.

ARIMA(1, 2, 2) was selected to make the necessary predictions. Here, the parameter p = 1 indicates that we depend only on the one most recent past value. The parameter d = 2 is the differencing order, which indicates that the original series was non-stationary, and the parameter q = 2 indicates that we depend only on the past two error terms. In the case of seasonal ARIMA, ARIMA(0, 1, 0)(0, 1, 0) [16] was selected.

The forecast results for ARIMA and SARIMA are shown in TABLE II and TABLE III respectively, along with supporting graphs in Fig. 5 and Fig. 6:

TABLE II
ARIMA 2018

Period     Forecast (M USD)   Lower 95% Limit   Upper 95% Limit
Q1/2018    44630.88           42776.15          46485.60
Q2/2018    48308.01           46168.07          50447.96
Q3/2018    53051.29           50628.67          55473.90
Q4/2018    53895.73           51096.32          56695.15

TABLE III
SEASONAL ARIMA 2018

Period     Forecast (M USD)   Lower 95% Limit   Upper 95% Limit
Q1/2018    43270.99           41389.48          45152.51
Q2/2018    45854.73           43717.48          47991.98
Q3/2018    52879.35           50513.85          55244.85
Q4/2018    58019.32           55445.74          60592.90

D. Neural network autoregression model

We constructed 20 feedforward networks. Each of them was a 2-2-1 network (i.e. 2 input variables mapped to the dependent variable through 1 hidden layer with 2 nodes) with 9 weights. The final model was generated by taking the average of these 20 networks.

The lagged values of the univariate time series are supplied as input to the neural network, whereas the output generated is the predicted revenue data point. The neural network that
best fits our data was the NNAR(1, 1, 2) [16] model. It indicated that we have one input layer, one hidden layer with 2 nodes and one output layer. It shows that we depend only on the previous value as input to the model, i.e. lag = 1.

The forecast results are shown in TABLE IV below, along with a supporting graph in Fig. 7:

TABLE IV
NNAR 2018

Period     Forecast (M USD)
Q1/2018    44912.42
Q2/2018    46999.20
Q3/2018    51225.26
Q4/2018    54534.07

Fig. 5. Graph depicting the forecasting results obtained by applying ARIMA.

Fig. 6. Graph depicting the forecasting results obtained by applying SARIMA.

Fig. 7. Graph depicting the forecasting results obtained by applying NNAR.

E. Comparing results of the models applied

TABLE V contains the mean absolute percentage error (MAPE) values calculated in order to see which model fits our data better:

TABLE V
MAPE

Model      MAPE
ETS        3.501521
NNAR       4.663975
ARIMA      3.469872
SARIMA     2.884046

The sales forecast (million USD) for the year 2019 is given in TABLE VI:

TABLE VI
FORECAST 2019

Period     NNAR (M USD)   ETS (M USD)   ARIMA (M USD)   SARIMA (M USD)
Q1/2019    62670.08       61366.97      62405.49        60559.35
Q2/2019    63640.42       62808.49      64939.05        62403.35
Q3/2019    65728.01       65709.06      67048.13        66093.35
Q4/2019    69150.43       75518.32      69009.43        73847.44

On 25th April 2019, Amazon reported net sales of $59.7 billion for the 1st quarter of 2019. In our study, seasonal ARIMA predicted $60.6 billion, the closest to the actual figure among the models under consideration. This suggests that the seasonal ARIMA model best fits the sales data and provides better estimates/forecasts.

IV. CONCLUSION

In this paper, we analyze three methods to forecast sales for Amazon based on the historical data. The results show that
seasonal ARIMA gives the most accurate results compared to the other applied methods. Based on the forecasting results, Amazon can get a big picture of the demand and then take relevant measures to arrange resources, such as hiring more employees, storing more items or expanding shipping capacity, and thus offer good service to improve customer satisfaction.

Although the forecasting error (MAPE) [11] of the applied methods is small enough for them to be applied to the forecast of Amazon sales, there are still some obstacles to using these methods. One major obstacle impeding the implementation of the forecast is obtaining the data necessary to carry out the forecast precisely. Amazon's quarterly sales are influenced by many diverse factors, such as population, disposable household income, interest rates, macroeconomic trends and so on.
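
As a rough check on the comparison summarised above, the percentage error of each model's Q1/2019 forecast can be recomputed from figures quoted in this paper alone (the TABLE VI forecasts and the reported $59.7 billion). The sketch below is in Python for illustration and is not the authors' R code. The function name `mape` is an assumption, and the factor of 100 is also an assumption: equation (1) omits it, but the values in TABLE V appear to be percentages.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, following equation (1) times 100."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Reported Q1/2019 net sales, converted to millions of USD.
actual_q1_2019 = [59700.0]

# Q1/2019 forecasts from TABLE VI, in millions of USD.
forecasts_q1_2019 = {
    "NNAR": [62670.08],
    "ETS": [61366.97],
    "ARIMA": [62405.49],
    "SARIMA": [60559.35],
}

errors = {name: mape(actual_q1_2019, f) for name, f in forecasts_q1_2019.items()}
for name, err in sorted(errors.items(), key=lambda item: item[1]):
    print(f"{name}: {err:.2f}%")

best = min(errors, key=errors.get)
print("closest model:", best)
```

With a single quarter the measure reduces to one absolute percentage error per model, so it is only a sanity check and not a replacement for the four-quarter MAPE in TABLE V.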
REFERENCES
[1] R. A. Khan and S. Quadri, “Business intelligence: an integrated approach,” Business Intelligence Journal, vol. 5, no. 1, pp. 64–70, 2012.
[2] J.-h. Yu and X.-j. Le, “Sales forecast for Amazon sales based on different statistics methodologies,” DEStech Transactions on Economics, Business and Management, no. iceme-ebm, 2016.
[3] P. Sobreiro, D. Martinho, and A. Pratas, “Sales forecast in an IT company using time series,” in 2018 13th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1–5, IEEE, 2018.
[4] A. Ribeiro, I. Seruca, and N. Durão, “Sales prediction for a pharmaceutical distribution company: A data mining based approach,” in 2016 11th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1–7, IEEE, 2016.
[5] G. Nunnari and V. Nunnari, “Forecasting monthly sales retail time series: A case study,” in 2017 IEEE 19th Conference on Business Informatics (CBI), vol. 1, pp. 1–6, IEEE, 2017.
[6] P. A. Cholette, “Prior information and ARIMA forecasting,” Journal of Forecasting, vol. 1, no. 4, pp. 375–383, 1982.
[7] S. G. Makridakis, S. C. Wheelwright, and R. J. Hyndman, Forecasting Methods for Management. Wiley, 2008.
[8] I. Alon, M. Qi, and R. J. Sadowski, “Forecasting aggregate retail sales: a comparison of artificial neural networks and traditional methods,” Journal of Retailing and Consumer Services, vol. 8, no. 3, pp. 147–156, 2001.
[9] G. Zhang, B. E. Patuwo, and M. Y. Hu, “Forecasting with artificial neural networks: The state of the art,” International Journal of Forecasting, vol. 14, no. 1, pp. 35–62, 1998.
[10] E. Bouding, Time Series Analysis: Forecast and Control. Prentice-Hall Inc, 1994.
[11] Stephanie, “Mean absolute percentage error (MAPE).” https://www.statisticshowto.datasciencecentral.com/mean-absolute-percentage-error-mape/, 2017. Accessed: 2019-05-23.
[12] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy,” International Journal of Forecasting, vol. 22, no. 4, pp. 679–688, 2006.
[13] L. Zhou and X. Yong, “Electricity demand forecasting based on ARIMA model and linear neural network,” J. Ludong Univ, vol. 3, pp. 89–94, 2015.
[14] I. Naim and T. Mahara, “Comparative analysis of univariate forecasting techniques for industrial natural gas consumption,” International Journal of Image, Graphics and Signal Processing, vol. 10, no. 5, p. 33, 2018.
[15] R. J. Hyndman, “Automatic time series forecasting,” in Book of Abstracts, p. 75, 2007.
[16] L. Bianchi, J. Jarrett, and R. C. Hanumara, “Improving forecasting for telemarketing centers by ARIMA modeling with intervention,” International Journal of Forecasting, vol. 14, no. 4, pp. 497–504, 1998.
