Patel Prince Vipulbhai Thesis 2021
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science
in Software Engineering
By
Prince Vipulbhai Patel

August 2021
The thesis of Prince Vipulbhai Patel is approved:
_______________________________________ _________________
_______________________________________ _________________
_______________________________________ _________________
Table of Contents

Signature page
Abstract
1 Introduction
  1.1 Background
  1.2 Statement of Problem
  1.3 Approach
2 Theory
  2.1 SVM
  2.2 LSTM
3 Methodology
4 Creating data-set
5 Implementing algorithm
6 Results
7 Conclusion
ABSTRACT
By
Prince Vipulbhai Patel
Stock market trading has gained popularity with advances in technology and social media. With today's technology, we can aim to predict the future value of stocks. To make informed predictions, time series analysis is used by most stock brokers around the world. This paper explains and analyzes the prediction of stock prices using machine learning. I propose a machine learning approach that is trained on available stock data and uses the acquired knowledge to make accurate predictions. In this context, the study uses two machine learning techniques, Support Vector Machines (SVM) and Long Short-Term Memory (LSTM) networks, to predict stock prices.
Chapter 1
Introduction
1.1 Background
The stock market is a collection of markets and exchanges that facilitate the buying, selling, and issuance of shares of public companies. These financial venues are either formal institutionalized exchanges or over-the-counter marketplaces, operated under a defined set of strict regulations. The stock market is also a platform where buyers and sellers interact and form transactions. Its benefits include a controlled and monitored environment in which hundreds of thousands of people can buy and sell shares with transparency and fair pricing practices. The stock market addresses many vital concerns, such as efficient pricing, secure transactions, and the maintenance of liquidity.
On the other hand, there is the technical analysis side of stock market analysis. Technical analysis studies the statistics generated by market activity, including factors such as past prices and trading volumes. Technical traders obtain the information needed for trading from charts. This differs from fundamental analysis, where traders consider factors drawn from the asset itself, outside of price fluctuations.
In recent times, machine learning has become increasingly dominant across industries and has provided valuable information to countless traders who can apply its techniques to this field.
1.2 Statement of Problem
Forecasting stock prices is a very strenuous process because of market volatility, and it depends on an accurate forecast model. Stock market inconsistencies tend to fluctuate, which impacts an investor's decisions. Stock prices are highly dynamic and susceptible to rapid changes, because the underlying behaviour of the financial domain depends on a mix of known and unknown parameters. These parameters include the previous day's closing price and the P/E ratio, as well as unknown factors such as the economy, election results, and rumors. There have been numerous attempts to predict stock prices using machine learning. The focus of each research method can vary in three distinctive ways:
1- The targeted price change can be near-term (less than a minute), short-term (tomorrow to a few days later), or long-term (months later).
2- The set of stocks can range from a particular industry to all stocks in general.
3- The predictors used can range from global news and economic trends, to particular characteristics of the company, to purely time series data of the stock price.
Possible stock market prediction targets include the future stock price, the volatility of prices, and the market trend. There are two types of predictions used in a stock market prediction system: dummy predictions and real-time predictions. In dummy predictions, a defined set of rules predicts the future share price by calculating the average price. In real-time predictions, live market data must be used.
1.3 Approach
This paper analyzes financial data predictor programs in which a dataset stores all historical stock prices and is treated as the training set for the program. The main purpose of the prediction is to reliably reduce the uncertainty associated with investment decision making in the stock market. The accuracy achieved gives an idea of the stock's future price based on the prediction made by the algorithm.
The stock market arguably follows the Random Walk Theory, which implies that the best prediction of tomorrow's stock market value is today's value. The Random Walk Theory holds that changes in stock prices have the same distribution and are independent of one another. This means that past trends and movements of stock prices cannot be used to predict future movements. The theory suggests that since stocks are unpredictable, all methods of predicting stock prices in the long run are futile because of the additional risks assumed. Random Walk Theory was popularized in 1973 by Burton Malkiel, a Princeton economist. Since then, increasingly sophisticated computer algorithms have been used to identify and exploit trends in stock market prices, even when the trends spotted last only a short time, such as a second.
Chapter 2
Theory
2.1 SVM
Support vector machines are useful for regression and classification problems. SVM is a learning model built around the idea of hyperplanes as decision boundaries; in two-dimensional space the hyperplane is a line. Once the model is built, the support vector machine classifies new examples by assigning them to one side of a hyperplane separating the output classes. Support vector machines are an example of margin-based models, built so that the various classes lie at a maximum distance from the decision boundary. By using a five-fold cross-validation procedure to evaluate estimates, support vector machines become more robust in higher dimensions. This makes the algorithm effective in machine learning, for example when the number of dimensions exceeds the number of samples. SVMs can achieve higher accuracy, at a higher computational cost, thanks to tunable parameters: the regularization parameter C, the kernel of the SVM classifier, and gamma. The regularization parameter chooses how much misclassification is allowed, the kernel (such as rbf or poly) determines the shape of the hyperplane that is learned, and gamma controls how far the influence of a single training example reaches.
Figure 2.1 The SVM decision-making boundary
The advantages of support vector machines are:
- Effective in high dimensional spaces. SVMs are a good choice when we have little prior knowledge of the data, and they work well even with unstructured and semi-structured data.
- Still effective when the number of dimensions is greater than the number of samples.
- Memory efficient, since the decision function uses only a subset of the training points, called the support vectors.
- Versatile, meaning different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.
The disadvantages of support vector machines are:
- If the number of features is much greater than the number of samples, avoiding overfitting requires careful choice of the kernel function and regularization term.
- SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.
- Since the final model is not easy to interpret, we cannot make small calibrations to the model.
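To illustrate the kernel versatility noted above, the following is a small sketch fitting scikit-learn's SVC with different kernels on synthetic data. This is an illustration only, not the thesis dataset or code.

```python
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Toy two-class data standing in for real stock features.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

scores = {}
for kernel in ("linear", "rbf", "poly"):
    clf = SVC(kernel=kernel).fit(X, y)   # same data, different decision boundary
    scores[kernel] = clf.score(X, y)     # training accuracy, for comparison only
print(scores)
```

Swapping the `kernel` argument is all it takes to change the shape of the learned decision boundary.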
2.2 LSTM
A key difference between LSTMs and traditional RNNs is the gating mechanism, which gives LSTMs the capability to retain long term memory. This provides benefits in sequential tasks such as natural language processing. The difference can be demonstrated when a passage of text is given for analysis and the model has to generate the next word. An RNN that is given a specific piece of information early in the text, such as a name or object, will often struggle to reuse that information correctly when later predicting the next word.
Since RNNs generally suffer from short term memory, the average RNN can only utilize information from the text shown within the previous few sentences. This differs from an LSTM, which is able to store information from much earlier periods of the sequence.
LSTM cells have beneficial mechanisms that act as gates. An average RNN cell contains a single tanh activation function that produces the output and the new hidden state. An LSTM cell has a more complicated arrangement, because it needs three pieces of information: the current input data, the short term memory from the prior cell, and the long term memory. Short term memory is also called the hidden state, and long term memory is referred to as the cell state.

The cell uses gates to decide which information is saved or discarded at each time step before transferring the short term and long term information to the next cell.
These gates carefully select relevant information and get rid of anything irrelevant, so that only beneficial information is stored. They are known as the Forget Gate, the Input Gate, and the Output Gate, and each gate has its own mechanism.
Input Gate:

The input gate decides what information will be stored in long term memory. It works only with the information provided by the current input and the short term memory from the previous time step, so it must separate beneficial information from information that is not beneficial.

This happens in two ways. The first is a selection method that decides which information is discarded and which is stored: the short term memory and the current input are passed to a sigmoid function, which transforms values to between zero and one, with zero indicating unimportant information and one indicating usable information. This is favorable for deciding which values should be used and saved, since the learned weights of the sigmoid function allow the more unimportant information to be discarded.
The second method uses the current input and the short term memory to feed a tanh activation function, producing candidate values. The outputs of the two methods are then multiplied, and the final result indicates the information to be added to long term memory.
Forget Gate:

The forget gate determines which long term memory information will be kept or rejected. It does this by multiplying the incoming long term memory by a forget vector created from the short term memory and the current input. Similar to the first method of the input gate, the forget vector is selective in picking information: the current input and the short term memory are passed through a sigmoid function. The process is very similar to the input gate's and can also be described as producing a scale of zero to one that is multiplied with the long term memory, controlling which parts of the long term memory are retained.

The outputs of the input gate and forget gate are combined with a pointwise addition, creating a new long term memory that is passed on to the next cell.
Output Gate:

The output gate uses the previous short term memory, the new long term memory, and the current input to create a new short term memory, which will be given to the cell at the next time step. First, the current input and the short term memory are passed to a sigmoid function with its own weights to create a filter. The new long term memory is then put through a tanh activation function. The outputs of these two processes are multiplied to create the new short term memory.

The short term and long term memory created are transferred to the next cell as the process repeats. The output at each time step is derived from the short
term memory.
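The gate computations described above can be written compactly. The following NumPy sketch implements one step of a standard LSTM cell; the stacked weight layout and sizes are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W (4n x m), U (4n x n), b (4n,) hold the stacked weights for the
    input gate, forget gate, candidate values, and output gate.
    h_prev is the short term memory (hidden state), c_prev the long
    term memory (cell state).
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b           # stacked pre-activations, shape (4n,)
    i = sigmoid(z[0:n])                  # input gate: what to store
    f = sigmoid(z[n:2 * n])              # forget gate: what to keep
    g = np.tanh(z[2 * n:3 * n])          # candidate values
    o = sigmoid(z[3 * n:4 * n])          # output gate: what to expose
    c = f * c_prev + i * g               # new long term memory (cell state)
    h = o * np.tanh(c)                   # new short term memory (hidden state)
    return h, c
```

Note how the forget gate scales the old cell state, the input gate scales the candidate values, and the pointwise addition `f * c_prev + i * g` forms the new long term memory, exactly as described above.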
The advantages of LSTM are:
- It can model a sequence of records so that each pattern can be assumed to depend on previous ones.
- It can also be used with convolutional layers to extend the effective pixel neighbourhood.

A disadvantage of LSTM is:
- It cannot process very long sequences when using tanh or ReLU as the activation function.
Figure 2.2 Working of LSTM
Chapter 3
Methodology
● Step 1: Find the data on the internet. The preferred website is Yahoo Finance, for the uniformity of its data. The goal is to download the data so that the financial market value of any stock can be predicted. The share values up to the closing date are then downloaded.
● Step 2: The data downloaded from Yahoo Finance comes in the form of a CSV (comma separated values) file. The values for any stock can be loaded into the algorithm.
● Step 3: The data in the CSV file is modified into the desired format so that the algorithm can process it.
● Step 4: The algorithms are then started in the terminal separately. The two algorithms are coded separately, so they can be run independently on different data.
● Step 5: The selected code runs and finally outputs all the accuracy values. The mean accuracy and standard deviation are printed after all the accuracies are calculated.
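Steps 2 and 3 can be sketched as a small loading routine, assuming a CSV in Yahoo Finance's export format (columns Date, Open, High, Low, Close, Adj Close, Volume). The function name and path are illustrative placeholders, not the thesis code.

```python
import pandas as pd

def load_prices(csv_source):
    """Load a Yahoo Finance CSV (path or file-like object) in date order."""
    df = pd.read_csv(csv_source, parse_dates=["Date"])
    return df.sort_values("Date").reset_index(drop=True)

# Usage (path is a placeholder):
# df = load_prices("AAPL.csv")
```

Sorting by date ensures later windowing and momentum calculations see the series in chronological order.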
Chapter 4
Creating data-set
For training and testing the predictive algorithms, a database was compiled. The database should have minimal deviation and should reflect normal behaviour, so it was important to select a time frame in which the stock market did not show any sudden jump or dip in prices. There are often times of economic collapse in certain sectors, industries, and countries, and the extent of these collapses might not be foreseeable or predictable.

For training and testing the algorithms and models, a time frame of January 2009 to October 2018 was chosen, since in 2007-2008 the stock market took a hit from the financial crisis.
Yahoo Finance provides a historical database which can be utilized for training the model. It provides the historical data of a stock using the following parameters, among others:
● Adj Close: Closing value of the stock before the market opens the next day, adjusted for dividends and stock splits
● Volume: Total number of shares that changed hands during that day
Fig 4.1 Creating the dataset
The dataset creation is an important step because the data that is fed to the algorithm needs to be in a format that can be processed. It is handled by a separate Python file named DatasetCreation.py. Dataset creation makes the data more organized and refined, while also cutting down the amount of time needed to process the output.
Figure 4.2 Code using the dataset
The stock price historical data was collected from Yahoo Finance for 9 companies, including:
● Apple
● Amazon
● Microsoft
Figure 4.3 Screenshot of the Apple CSV file from Yahoo Finance
The database is used to compile information derived from these parameters. The key terms used are:
● Momentum: This is a measure of the direction of change of a parameter, the stock price in our case. It is defined as +1 if the price has increased and -1 if the price has dropped from yesterday.
● Volatility: This determines the fluctuation of the same parameter at different times. In our case it is determined by the difference between yesterday's price and today's price.

Hence, the following parameters for the input dataset were derived from the above parameters:
● Sector Momentum: This parameter considers the other companies in the same sector and averages their momentum.
● Stock Momentum: This is the average of the last 5 days of momentum of the respective company.
● Stock Price Volatility: This is the average of the last 5 days of stock price volatility of the respective company.
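As a sketch, the momentum and volatility features described above might be computed with pandas as follows. The column names and the treatment of the first row (which has no previous day, so its momentum defaults to -1) are assumptions for illustration, not the thesis's DatasetCreation.py code.

```python
import pandas as pd

def add_features(df):
    """Add momentum/volatility features to a DataFrame with a 'Close' column."""
    change = df["Close"].diff()                          # today - yesterday
    # +1 if the price increased, -1 otherwise (flat or missing counts as -1).
    df["Momentum"] = (change > 0).astype(int) * 2 - 1
    df["Volatility"] = change
    # 5-day rolling averages, as described for the input dataset.
    df["StockMomentum"] = df["Momentum"].rolling(5).mean()
    df["StockVolatility"] = df["Volatility"].rolling(5).mean()
    return df
```

A sector momentum column could then be formed by averaging the `Momentum` columns of the other companies in the same sector.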
Fig 4.6 Date, Close, Change, Momentum used in the algorithm for the stock’s data
Chapter 5
Implementing algorithm

5.1 Support Vector Machine (SVM) implementation

This algorithm was implemented in Python using the scikit-learn library for machine learning.
Fig 5.2 SVM algorithm implementation with the libraries used
In this case, since the stock market data is non-linear, the rbf kernel was used, and the hyperparameters C and gamma were kept at their default values of 1 and 'scale' respectively. Higher values of C and gamma cause the decision boundary to become more curved and the variance to increase, raising the chance of overfitting. Accuracy can be improved by tuning the parameters with GridSearchCV() from sklearn.model_selection, though this may take a long time depending on the number of training samples. Once the classifier is trained by mapping the given features to labels, predictions are made on the test data, which consists of approximately 20% of the total data. This is compared with the target data present in the output file, and the accuracy is ascertained.
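A minimal sketch of this setup follows: an rbf-kernel SVC with the default C and gamma, an 80/20 train/test split, and optional tuning with GridSearchCV. The feature and label arrays are synthetic stand-ins, not the thesis dataset.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split, GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))             # stand-in for momentum/volatility features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in for up/down labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)               # accuracy on the held-out 20%
print("accuracy:", acc)

# Optional tuning, at the cost of extra training time.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1]}
grid = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X_tr, y_tr)
print("best params:", grid.best_params_)
```

The grid above is deliberately tiny; a larger grid multiplies the training time accordingly.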
The SVM implementation uses the scikit-learn library, which provides built-in functions for machine learning and for preprocessing steps such as data normalization. This is the key element that makes the model functional. Python is used to import the library, which helps with training on the datasets; the test dataset is then used to evaluate the trained classifier.
Figure 5.4 Features used by SVM
5.2 Long Short-Term Memory (LSTM) implementation

- Data Preparation:
● A value is defined for the time step. Using this time step the data is shifted, and the two resulting series are concatenated to get the output set.
● LSTM works on data that lies within the scale of the activation function of the network. The data is therefore scaled to match the hyperbolic tangent (tanh), whose output range of −1 to +1 is ideal for time series data.
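The data preparation bullets above can be sketched as follows: scale the prices into the tanh range (−1, +1), then shift the series by the time step to build input windows and target values. The window length of 5 is an assumption for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def make_windows(prices, time_steps=5):
    """Scale prices to (-1, 1) and build (window -> next value) pairs."""
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaled = scaler.fit_transform(np.asarray(prices, dtype=float).reshape(-1, 1))
    X, y = [], []
    for i in range(len(scaled) - time_steps):
        X.append(scaled[i:i + time_steps])   # the last `time_steps` values
        y.append(scaled[i + time_steps])     # the value to predict
    return np.array(X), np.array(y), scaler
```

The returned scaler can later invert the transformation so that predictions are reported in the original price scale.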
- Model Development:
● LSTM is a type of Recurrent Neural Network (RNN). This type of neural network is useful for remembering over a long sequence of data, and it does not depend only on the most recent inputs.
● Time steps: These are the time steps of the parameters for the input dataset.
● Features: These are the parameters considered and observed for the input dataset.
While compiling the network, we need to specify the loss function and the optimization algorithm. Here, MAE (mean absolute error) is used as the loss function and Adam as the optimization algorithm; Adam selects a suitable learning rate for the network. The model is then fit to the training data, the output is predicted, and the accuracy of the model is calculated.
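A minimal Keras sketch of the model development described above: an LSTM network compiled with MAE loss and the Adam optimizer. The layer size, time steps, feature count, and training data here are illustrative assumptions, not the thesis configuration.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

time_steps, n_features = 5, 1
model = Sequential([
    Input(shape=(time_steps, n_features)),
    LSTM(50),        # recurrent layer that carries the cell/hidden state
    Dense(1),        # single regression output: the next value
])
model.compile(loss="mae", optimizer="adam")

# Synthetic windowed series standing in for the scaled stock data.
X = np.random.rand(100, time_steps, n_features) * 2 - 1
y = np.random.rand(100, 1) * 2 - 1
model.fit(X, y, epochs=2, batch_size=16, verbose=0)
pred = model.predict(X[:3], verbose=0)
```

In practice, the windowed arrays produced during data preparation would replace the random arrays, and training would run for more epochs.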
Fig 5.6 LSTM code flow
The LSTM implementation uses NumPy and pandas for dataset manipulation, scikit-learn for machine learning utilities, Keras for neural network creation, and Matplotlib for visualization. All of these are used to process the data and output the results.
Chapter 6
Results
To make a fair comparison between the two algorithms, each algorithm was run multiple times. The test data is fed into the algorithm, which runs each of the files separately. Each algorithm is run 30 times on the processed test data, producing an accuracy value each time. The mean accuracy and standard deviation are calculated at the end. These numbers give a better understanding of the overall result by narrowing it down to a single number, which helps compare the two algorithms. A different training and testing dataset is used for each run. The results for the two algorithms are reported below.
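The evaluation loop just described can be sketched as follows: run a classifier 30 times on different random train/test splits and summarize the accuracies with their mean and standard deviation. The data here is synthetic, standing in for the thesis dataset.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0).astype(int)

accs = []
for run in range(30):
    # A different split per run, as described above.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=run)
    accs.append(SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te))

print(f"mean accuracy: {np.mean(accs):.4f}, std dev: {np.std(accs):.4f}")
```

The same loop, with the LSTM model substituted for the SVC, yields the comparable LSTM statistics.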
6.1 SVM result

For 30 runs of the SVM algorithm, a mean accuracy of approximately 50.73% was achieved with a standard deviation of 0.308. This shows that SVM performance is consistent across the 30 runs, which is due to the nature of the SVM algorithm: it keeps training until it can classify the maximum amount of testing data, resulting in higher accuracy. The SVM code predicts the value of the stock for the next 30 days and then prints a separate accuracy for each prediction; all the accuracies are then averaged at the end.

For other numbers of runs, the algorithm performs similarly, and the accuracy is not affected much by increasing the number of times the algorithm runs. This indicates that the accuracy is stable.
The accuracy is relatively low compared to LSTM's accuracy. This is because the dataset used has higher variability, and the design of the algorithm makes it less effective on a dataset similar to the one used.

The standard deviation is relatively low, though a little higher than LSTM's for this particular prediction and dataset, meaning the fluctuation from the average returns is higher than with LSTM. Since the standard deviation changes with the dataset, this dataset results in relatively little fluctuation from the average values. The standard deviation is still considered low, which means the predictions fluctuate little and the results are consistent.
Figure 6.1 SVM results
6.2 LSTM result
For 30 runs of the LSTM algorithm, the mean accuracy was 62.29% with a standard deviation of 0.256. The fluctuation is minimal, showing that the algorithm is reliable, and there is an overall performance improvement compared to SVM. The accuracy is calculated in a similar way: the accuracy is predicted for each stock for the next 30 days, and then the mean accuracy is calculated.
As can be seen, LSTM performs well compared to SVM, as it uses powerful methods to predict the sequence by storing past data. This is beneficial in stock market prediction because the future value of a stock is predicted from its past values. Stock market prediction is a time series problem, as mentioned before, so the LSTM algorithm is well suited when the problem is viewed from a time series perspective. A traditional RNN (recurrent neural network) has the vanishing gradient problem, whereas LSTM does not, which helps it perform better on the time series problem being solved.
The standard deviation achieved here is relatively low, meaning the returns fluctuate less from the average. LSTM has better results overall. According to the results displayed here, the Python implementation of this algorithm works effectively.
Figure 6.2 LSTM results
Chapter 7
Conclusion
This paper shows that machine learning can be used to predict stock market prices with reasonable accuracy. The results indicate that historical data can be used to predict stock movement, but the choice of algorithm depends on requirements such as time, standard deviation, and mean accuracy. This implementation incorporated the factors that affect stock performance. If a larger number of factors is used, and the data is adequately preprocessed and filtered before training the network model, then a higher accuracy can be achieved. The results lead to the conclusion that LSTM works better in this case. Although both algorithms have pros and cons, each can be used differently based on the requirements.
The implementations of SVM and LSTM using moving averages were done separately. For future work, intraday prices could also be used to compare values and to better understand the volatility of stocks, crude oil, and gold prices. Stock selling and buying data could also be used to understand how the stock price and external factors, such as surges and dips, influence buying and selling patterns. The model could be trained for volume prediction, meaning the quantity of stock that can be sold or purchased in a way that results in profit; this would help develop a more accurate prediction. The models could also be extended to provide live interactive predictions based on user-given data, and then be applied to other forecasting problems such as weather forecasting, disease forecasting, and house price forecasting. Other factors that impact stock prices could be included to encourage more accurate prediction, eventually increasing the chances of profit from the stock.
The ranges of the experimental samples are not large enough. Larger datasets would give more accurate predictions, but at the same time would require computers with greater processing power.