
4th International Conference on Electrical Information and Communication Technology (EICT)

Stock Price Prediction: A Comparative Study between
Traditional Statistical Approach and Machine Learning Approach

Paper ID: 201

Authors:
1. Indronil Bhattacharjee, Department of CSE, KUET
2. Pryonti Bhattacharja, Department of Economics, SUST

Presented by: Indronil Bhattacharjee
2 Outline

 Introduction
 Objectives
 Methodologies
 Proposed System
 Results and Discussions
 Conclusion
 References
3 Introduction

 The stock market is an aggregation of stockbrokers and traders who buy
and sell shares of stocks.

 Stock data are non-stationary, chaotic and random, and depend on several
technical parameters.

 Since statistical approaches are linear in nature, their prediction
performance suffers in case of a sudden rise or fall in stock prices.

 In the modern era of artificial intelligence, machine learning plays an
important role in time-series prediction.
4 Objectives

 To predict stock prices using both statistical and machine learning
approaches.

 To compare the predictions of the statistical and machine learning
approaches.

 To find the better approach, i.e. the one that predicts stock prices more
accurately.
5 Methodologies

Statistical Methods:
 Simple Moving Average
 Weighted Moving Average
 Exponential Smoothing
 Naïve Approach (Last Value Method)

Machine Learning Methods:
 Regression (Simple Linear, Lasso, Ridge)
 K-Nearest Neighbour
 Random Forest
 Support Vector Machine (Support Vector Regression)
 Neural Network Models (Single Layer Perceptron, Multilayer Perceptron, Long Short-Term Memory)
6 Simple Moving Average

 An unweighted mean of a specific number of previous data points is taken
as the predicted value for the next day.

 The formula used for SMA:

    ŷ_{t+1} = (1/n) · Σ_{i = t−n+1}^{t} y_i        (1)

In (1),
ŷ_{t+1} = predicted closing price for the (t+1)-th day
y_i = actual closing price on the i-th day
n = number of days considered for prediction
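As an illustrative sketch (not code from the paper), equation (1) takes only a few lines of Python; `sma_forecast` and the sample `closes` list are hypothetical names:

```python
def sma_forecast(prices, n):
    """Predict the next day's closing price as the unweighted
    mean of the last n actual closing prices (equation (1))."""
    window = prices[-n:]
    return sum(window) / len(window)

closes = [10.0, 11.0, 12.0, 13.0, 14.0]
forecast = sma_forecast(closes, 3)  # mean of 12, 13, 14 -> 13.0
```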
7 Weighted Moving Average

 The difference between SMA and WMA is that weights are applied to the
previous values when predicting the future value, so that more recent days
contribute more.

 The formula used for WMA:

    ŷ_{t+1} = ( Σ_{i=1}^{n} w_i · y_{t−n+i} ) / ( Σ_{i=1}^{n} w_i )

w_i = weight used for the i-th day
w = unit weight

Fig. Weights for the i-th last day of a 15-day WMA
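A minimal Python sketch of the WMA forecast, assuming linearly increasing weights (weight i for the i-th oldest day in the window); `wma_forecast` is a hypothetical helper, not from the paper:

```python
def wma_forecast(prices, n):
    """Weighted moving average over the last n days: the newest day
    gets weight n, the oldest day in the window gets weight 1
    (linear weights assumed)."""
    window = prices[-n:]
    weights = range(1, n + 1)  # oldest -> 1, ..., newest -> n
    weighted_sum = sum(w * p for w, p in zip(weights, window))
    return weighted_sum / sum(weights)

closes = [10.0, 11.0, 12.0, 13.0, 14.0]
forecast = wma_forecast(closes, 3)  # (1*12 + 2*13 + 3*14) / 6
```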
8 Exponential Smoothing

 A smoothing constant, α, is used to smooth the new prediction with the
previous prediction.

 A well-chosen smoothing constant maximizes prediction accuracy: larger α
gives more weight to the most recent observation, smaller α to the history.

 The formula used for exponential smoothing:

    ŷ_{t+1} = α · y_t + (1 − α) · ŷ_t

Here,
α = smoothing constant (0 < α < 1)
9 Naïve Approach

 The naïve approach is also called the Last Value method.

 The last actual value is taken as the predicted value for the next day.

 The formula used for the naïve approach:

    ŷ_{t+1} = y_t        (6)

In (6),
ŷ_{t+1} = predicted closing price for the (t+1)-th day
y_t = actual closing price on the t-th day
10 Regression (Simple Linear, Lasso, Ridge)

 Simple linear regression is one of the fundamental supervised machine
learning algorithms used for regression.

 Ridge regression is a technique that reduces model complexity and
prevents over-fitting by penalizing large coefficients.

 Lasso stands for Least Absolute Shrinkage and Selection Operator. Lasso
shrinks coefficient values towards a central point, which can drive some of
them to zero.
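All three regression variants are available in scikit-learn. The snippet below is a toy sketch, not the paper's experiment: the `[open, high, low]` rows and the `alpha` regularization strengths are illustrative values we chose:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge

# Toy feature rows [open, high, low] and closing prices (illustrative data).
X = np.array([[10, 11, 9], [11, 12, 10], [12, 13, 11], [13, 14, 12]], dtype=float)
y = np.array([10.5, 11.5, 12.5, 13.5])

predictions = {}
for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    model.fit(X, y)
    # Predict the close for a new day with open=14, high=15, low=13.
    predictions[type(model).__name__] = model.predict([[14.0, 15.0, 13.0]])[0]
```

Ridge and Lasso give slightly shrunken coefficients compared to plain least squares, which is the over-fitting protection mentioned above.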
11 K-Nearest Neighbour

 KNN predicts a numerical target based on a similarity (distance) measure.

 The prediction is the average of the numerical targets of the K nearest
neighbours.
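A minimal scikit-learn sketch of KNN regression (the one-feature day-index data is our illustrative choice, not the paper's feature set):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # e.g. day index
y = np.array([10.0, 11.0, 12.0, 13.0, 14.0])       # closing prices

knn = KNeighborsRegressor(n_neighbors=3)
knn.fit(X, y)
# Prediction = average target of the 3 nearest training points:
# neighbours of 5.0 are days 3, 4, 5 -> mean(12, 13, 14) = 13.0
pred = knn.predict([[5.0]])[0]
```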
12 Random Forest

 Random Forest is one of the ensemble learning algorithms.

 It makes a prediction by averaging the decisions of a collection of
independently trained decision trees.

 The Random Forest prediction formula is as follows:

    ŷ = (1/B) · Σ_{b=1}^{B} f_b(x)        (8)

In (8),
ŷ = final prediction
B = number of decision trees (estimators)
f_b = decision function of the b-th decision tree
13 Support Vector Machine (Support Vector Regression)

 SVM can also be used as a regression method.

 Support Vector Regression uses the same principles as SVM classification,
with only a few minor differences.

 A margin of tolerance (ε) is set: approximation errors smaller than ε are
ignored when fitting the regression function.

Fig. Support Vector Machine

14 Neural Network Models (Single Layer Perceptron)

 A Single Layer Perceptron is a neural network model that consists of only
one input layer and one output layer.

 There is no hidden layer in between.

Fig. Single Layer Perceptron

15 Neural Network Models (Multilayer Perceptron)

 In an MLP model, one or more hidden layers are present between the input
layer and the output layer.

    y_k = φ( Σ_{i=1}^{n} w_{ki} · x_i − θ_k )        (11)

In (11),
y_k = final output
φ = activation function
w_{ki} = weight from input i to neuron k
x_i = i-th input
θ_k = threshold
n = number of neurons

Fig. Multilayer Perceptron
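Equation (11) for a single neuron can be checked numerically; the sketch below assumes a sigmoid for the activation φ (the deck does not name one):

```python
import numpy as np

def neuron(x, w, theta):
    """One neuron of equation (11): y = phi(sum_i w_i * x_i - theta),
    with a sigmoid assumed for the activation function phi."""
    z = np.dot(w, x) - theta
    return 1.0 / (1.0 + np.exp(-z))

# With zero weights and zero threshold, z = 0 and sigmoid(0) = 0.5.
out = neuron(np.array([1.0, 0.5]), np.array([0.0, 0.0]), 0.0)
```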


16 Neural Network Models (LSTM)

 LSTM is a variant of the Recurrent Neural Network (RNN).

 It adds a memory cell, a forget gate and an update gate to the simple RNN.

Fig. LSTM
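To make the gate mechanics concrete, here is one LSTM time step in NumPy. This is a didactic sketch, not the model trained in the paper; the combined-gate weight layout and random initialisation are our implementation choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step: the forget gate decides what to erase from
    the memory cell, the input (update) gate decides what to write,
    and the output gate decides what to expose as the hidden state."""
    z = W @ np.concatenate([x, h_prev]) + b  # all four gates at once
    n = h_prev.size
    f = sigmoid(z[0:n])        # forget gate
    i = sigmoid(z[n:2 * n])    # input/update gate
    o = sigmoid(z[2 * n:3 * n])  # output gate
    g = np.tanh(z[3 * n:4 * n])  # candidate cell values
    c = f * c_prev + i * g     # new memory cell
    h = o * np.tanh(c)         # new hidden state
    return h, c

rng = np.random.default_rng(0)
n_hidden, n_input = 3, 2
W = rng.normal(size=(4 * n_hidden, n_input + n_hidden))
b = np.zeros(4 * n_hidden)
h, c = lstm_step(rng.normal(size=n_input), np.zeros(n_hidden), np.zeros(n_hidden), W, b)
```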
17 Performance Measures

 Mean Squared Error (MSE)

 Mean Absolute Percentage Error (MAPE)
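Both measures are straightforward to compute; a plain-Python sketch (function names are ours, MAPE expressed in percent):

```python
def mse(actual, predicted):
    """Mean squared error: average of squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0]
predicted = [110.0, 190.0]
err_mse = mse(actual, predicted)    # (100 + 100) / 2 = 100.0
err_mape = mape(actual, predicted)  # (10% + 5%) / 2, i.e. about 7.5
```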


18 The Proposed System

Fig. Flow diagram of statistical methods

Fig. Flow diagram of machine learning methods


19 Dataset

 We have used two datasets:

(a) Stock prices of Tesla
    Start: 29-06-2010
    End: 17-03-2017
    Days: 1693

(b) Stock prices of Apple
    Start: 02-01-2014
    End: 31-12-2018
    Days: 1259

Data collected from www.kaggle.com
20 Dataset (Continued)

 Each row represents the information of a single day.

 There are six columns for each row.

 1st column - the date

 2nd column - the opening price of that day

 3rd column - the highest price of that day

 4th column - the lowest price of that day

 5th column - the closing price of that day

 6th column - the volume of shares traded on that day.


21 Data Preprocessing

Data preprocessing includes:

 Checking for missing values and discarding those rows from the dataset

 Looking for categorical values

 Dropping unnecessary information from the dataset


22 Data Splitting
The dataset has been split into two parts as training data and test data.
(a) For Tesla dataset,
Training Data (1200 days): 29-06-2010 to 06-04-2015
Testing Data (492 days): 07-04-2015 to 17-03-2017
(b) For Apple dataset,
Training Data (1000 days): 02-01-2014 to 19-12-2017
Testing Data (258 days): 20-12-2017 to 31-12-2018
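A chronological split like the one above can be sketched in a few lines; the data must not be shuffled, since test days have to come after training days (`chronological_split` is a hypothetical helper):

```python
def chronological_split(rows, train_size):
    """Split time-ordered rows into a training prefix and a test
    suffix, preserving chronological order (no shuffling)."""
    return rows[:train_size], rows[train_size:]

# Illustrative: 10 days of data, first 7 for training.
rows = list(range(10))
train, test = chronological_split(rows, 7)
```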
23 Feature Selection

 For time-series prediction, feature selection is an important task,
because poorly chosen features can lead the prediction astray.

 In this system, three features have been selected:

1. The opening price
2. The highest price
3. The lowest price


24 Prediction

 As statistical methods, predictions have been performed using 10-day,
15-day and 30-day Simple Moving Average and Weighted Moving Average,
Exponential Smoothing with α = 0.3, 0.5 and 0.75, and the naïve approach.

 As machine learning methods, predictions have been performed using Simple
Linear Regression, Lasso Regression, Ridge Regression, K-Nearest Neighbour,
Random Forest with different numbers of estimators, Support Vector Machine,
and neural network models (SLP, MLP and LSTM).

 After the predictions, MSE and MAPE values are calculated.


25 Results and Discussions

Table I. Performance Measures of Different Simple Moving Average Methods

26 Results and Discussions (Continued)

Table II. Performance Measures of Different Weighted Moving Average Methods

27 Results and Discussions (Continued)

Table III. Performance Measures of Exponential Smoothing Method with Different Smoothing Constants

28 Results and Discussions (Continued)

Table IV. Performance Measures of Naïve Approach Method

29 Results and Discussions (Continued)

Table V. Performance Measures of Different Regression Methods

30 Results and Discussions (Continued)

Table VI. Performance Measures of KNN and SVM

31 Results and Discussions (Continued)

Table VII. Performance Measures of Random Forest with Different Numbers of Estimators

32 Results and Discussions (Continued)

Table VIII. Performance Measures of Different Neural Network Models
33 Conclusion

 A comparative study between statistical approaches and machine learning
approaches has been conducted in terms of prediction performance.

 Machine learning methods, especially MLP and LSTM, are found to be the
most accurate at predicting stock prices.
34 References

1) R. S. Dhankar, Capital Markets and Investment Decision Making, 1st ed.
Springer India, 2019, ch. Stock Market Operations and Long-Run Reversal Effect.
2) M. Usmani, S. H. Adil, K. Raza, and S. S. A. Ali, “Stock market prediction
using machine learning techniques,” in 2016 3rd International Conference on
Computer and Information Sciences (ICCOINS), Aug 2016, pp. 322–327.
3) H. Grigoryan, “A stock market prediction method based on support vector
machines (SVM) and independent component analysis (ICA),” Database Systems
Journal, vol. 7, no. 1, pp. 12–21, 2016.
4) S. Hansun, “A new approach of moving average method in time series
analysis,” in 2013 International Conference on New Media Studies (CoNMedia),
Nov 2013, pp. 1–4.
5) E. Ostertagova and O. Ostertag, “The simple exponential smoothing model,”
Sep 2011.
35 References (Continued)

6) Saptashwa, “Ridge and Lasso Regression: A Complete Guide with Python
Scikit-Learn,” https://towardsdatascience.com/ridge-and-lassoregressiona-complete-guide-with-python-scikit-learn-e20e34bcbf0b,
Sep 26, 2018.
7) D. S. Sayad, “K Nearest Neighbors - Regression,” http://saedsayad.com/k nearest neighbors reg.htm.
8) J. Patel, S. Shah, P. Thakkar, and K. Kotecha, “Predicting stock market
index using fusion of machine learning techniques,” Expert Syst. Appl.,
vol. 42, pp. 2162–2172, 2015.
9) D. S. Sayad, “Support Vector Machine - Regression (SVR),” https://www.saedsayad.com/support vector machine reg.htm.
10) S. Rathor, “Simple RNN vs GRU vs LSTM: Difference Lies in More Flexible
Control,” https://medium.com/@saurabh.rathor092/simplernnvs-gru-vs-lstm-difference-lies-in-more-flexible-control5f33e07b1e57, Jun 2, 2018.
36 Any Questions Please…?

37 THANK YOU