Optical Character Recognition Using Neural Network
• The task at hand is to classify handwritten digits using supervised machine learning
methods. The digits belong to classes 0–9.
• “Given a query instance (a digit) in the form of an image, our machine learning model
must correctly classify its appropriate class.”
Solution Approach – MLP
● Fully connected feed-forward network.
● Non-linear classifier
● Input: an M×N matrix, i.e., M samples with N attributes each.
● Batch or sequential (online) training; sequential training is faster and
preferred for huge datasets.
● MLP can be applied to difficult problems when trained with error
backpropagation.
● MLP can learn both linear and non-linear relationships in the data (a
minimal forward-pass sketch follows below).
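A minimal sketch of such a fully connected feed-forward pass, using NumPy. The layer sizes here (784 inputs, one 64-unit hidden layer, 10 output classes) are illustrative assumptions, not the slides' actual configuration.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_forward(x, weights, biases):
    # Every layer is fully connected to the next; a non-linear
    # activation (ReLU) follows each hidden layer.
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    return a @ weights[-1] + biases[-1]   # raw class scores for digits 0-9

# Illustrative sizes: 784 input pixels -> 64 hidden units -> 10 classes
rng = np.random.default_rng(0)
weights = [rng.normal(0, 0.01, (784, 64)), rng.normal(0, 0.01, (64, 10))]
biases = [np.zeros(64), np.zeros(10)]
print(mlp_forward(rng.random(784), weights, biases).argmax())  # predicted class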
K-Nearest Neighbour
• The k-Nearest Neighbors (k-NN) algorithm finds a group of k objects in the
training set that are closest to the test object and assigns a label based on the
predominance of a class in this neighborhood.
• A distance measure between two character images is needed in order to
apply this rule.
• The rule involves a training set of pre-classified cases, and a new sample
is classified by computing its distance to the nearest one.
• When the number of pre-classified points is large, it is better to use a
majority vote of the k nearest neighbors instead of the single nearest neighbor (see the sketch below).
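A minimal sketch of this majority-vote rule in NumPy, assuming flattened digit images as feature vectors; the data in the usage example is random and purely illustrative.

import numpy as np
from collections import Counter

def knn_predict(query, train_X, train_y, k=5):
    # Distance from the query image to every stored training image
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest images
    # Majority vote over the labels of the k nearest neighbours
    return Counter(train_y[nearest]).most_common(1)[0][0]

# Toy usage with random "images" (784 flattened pixels each)
rng = np.random.default_rng(0)
train_X, train_y = rng.random((100, 784)), rng.integers(0, 10, 100)
print(knn_predict(rng.random(784), train_X, train_y, k=5))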
Multiclass Perceptron
• In a multi-class perceptron, instead of being multiplied by a single
weight vector (for a single class), the feature vector is multiplied
(dot product) by a number of weight vectors, one separate weight vector
for each class.
• The class to which the data belongs is determined by the weight vector
that produces the highest activation (dot product).
• This decision-making process is called the Multi-Class Decision Rule (a
minimal sketch follows).
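A minimal sketch of this decision rule in NumPy; the shapes (10 classes, 784-pixel feature vectors) are illustrative assumptions.

import numpy as np

def mcp_predict(x, W):
    # W holds one row of weights per class; one dot product per class
    activations = W @ x
    # Multi-Class Decision Rule: pick the class with the highest activation
    return int(np.argmax(activations))

# Illustrative shapes: 10 digit classes, 784-pixel feature vectors
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 784))
print(mcp_predict(rng.random(784), W))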
Experiments – MLP
● A form of feed-forward neural network
● Model consists of:
■ Input layer
■ 4 hidden layers, where every node is connected to every node in the next layer
■ Output layer
● ReLU activation function in the input and hidden layers
● Output layer: softmax activation
● Optimizer: Adam
● Loss: categorical cross-entropy (a hedged sketch of this configuration follows)
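A sketch of this configuration in Keras. The hidden-layer widths (512/256/128/64) are assumptions, since the slides do not state them; the layer count, activations, optimizer, and loss follow the list above.

from tensorflow import keras
from tensorflow.keras import layers

# Four ReLU hidden layers, a softmax output layer, the Adam optimizer,
# and categorical cross-entropy loss, as listed above. The layer widths
# are assumed; the slides do not specify them.
model = keras.Sequential([
    layers.Input(shape=(784,)),              # flattened 28x28 digit image
    layers.Dense(512, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # one probability per digit 0-9
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()                              # layer-by-layer parameter summary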
Model Summary
• Layer-by-layer summary of the MLP model.
Experiments – MCP
● Signum Activation Function
● Sigmoid Activation Function
■ Training data: 10,000 samples
■ Learning rate: 0.01
■ Number of epochs: 20
● The threshold unit is trained using stochastic gradient descent
● The training data was not linearly separable
● To avoid overfitting, we capped the number of epochs used to train the perceptron (see the training sketch below)
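A minimal training sketch under these settings (learning rate 0.01, 20 epochs, one sample at a time). It uses the standard mistake-driven multiclass perceptron update, which amounts to stochastic gradient descent on the perceptron loss; the slides' signum/sigmoid threshold units are abstracted away here.

import numpy as np

def train_mcp(X, y, num_classes=10, lr=0.01, epochs=20):
    # lr and epochs follow the slide settings; the epoch cap
    # limits overfitting, as noted above.
    W = np.zeros((num_classes, X.shape[1]))
    for _ in range(epochs):
        for x, target in zip(X, y):         # one sample at a time (stochastic)
            pred = int(np.argmax(W @ x))
            if pred != target:              # mistake-driven update
                W[target] += lr * x         # reinforce the true class
                W[pred] -= lr * x           # penalize the predicted class
    return W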
Experiment – KNN
Two major things to be considered in KNN
1. Similarity Measure (Distance Measure)
2. Value of K
● Cosine similarity measure
● Euclidean distance
● Optimal value of k: chosen using the elbow method (sketched below)
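A minimal sketch of the two distance measures and an elbow-style search for k, in NumPy. The helper names and the validation split are illustrative assumptions.

import numpy as np

def euclidean_neighbors(q, X, k):
    return np.argsort(np.linalg.norm(X - q, axis=1))[:k]

def cosine_neighbors(q, X, k):
    # Cosine similarity: larger means closer, hence the descending sort
    sims = (X @ q) / (np.linalg.norm(X, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:k]

def validation_error(k, Xtr, ytr, Xva, yva, neighbors=euclidean_neighbors):
    wrong = 0
    for x, t in zip(Xva, yva):
        votes = np.bincount(ytr[neighbors(x, Xtr, k)], minlength=10)
        wrong += int(np.argmax(votes) != t)
    return wrong / len(yva)

# Elbow method: evaluate the error over a range of k and pick the k
# where the error curve flattens out, e.g.:
# errors = [validation_error(k, Xtr, ytr, Xva, yva) for k in range(1, 21)]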
Result Analysis
• The table below shows the accuracy and prediction-time comparison
between the three classifiers.
Results
● The MLP with six layers in total (four hidden layers plus the input and
output layers) gave us the highest accuracy (98%) and a training loss of 0.0057.
● During validation, accuracy decreased slightly while loss increased to 0.1102.
● The Multiclass Perceptron classifier mitigates this kind of inefficiency:
its prediction time is short because it only computes dot products in the
prediction phase.
● Training cost generally decreased as the number of epochs increased (up to 2000 epochs).
● k-NN stores all of the training data and compares each test instance
against all of it.
● Its prediction time was therefore also long.
● When the training and testing sizes were changed to 4000 and 2000, the
optimal value of k changed for both cosine-similarity k-NN and
Euclidean-distance k-NN.
● Accuracy thus increased slightly with cosine similarity when the number
of training images was increased.
● With Euclidean distance, the validation error was minimized.
Result Analysis – MLP
• Accuracy on the training data
Result Analysis – MCP & K-NN
Limitations
• MLPs with hidden layers have a non-convex loss function in which more
than one local minimum exists, so different random weight initializations
can lead to different validation accuracies.
• MLPs require tuning a number of hyperparameters, such as the number of
hidden neurons, layers, and iterations, and they are sensitive to feature
scaling.
Conclusion
• When the prediction times of k-NN, the Multilayer Perceptron, and the
Multiclass Perceptron are compared, the Multiclass Perceptron clearly
stands out with the shortest prediction time, whereas k-NN takes a long
time to predict the test instances.
• Compared to the other models, the multilayer perceptron (MLP) classifier
provides the highest recognition accuracy for character recognition, but
it takes a long time due to its dense network connections.
Future Work
• Development of systems that can recognize on-screen characters and text
under the varied conditions of everyday life, such as text in captions
or text on signboards and billboards.
• Character recognition systems for languages beyond the widely spoken
ones, such as regional and endangered languages.