Deep Learning
Deep learning is a subset of machine learning that uses neural networks with many layers to learn
complex representations of data; multi-layer neural networks are therefore its key building block.
Deep learning has shown impressive performance on a wide range of tasks, including image classification.
Deep learning is gaining popularity because of its superior accuracy when trained with large amounts of data.
TYPES OF DEEP LEARNING
The most widely used architectures in deep learning are convolutional neural networks (CNNs)
and recurrent neural networks (RNNs).
• Convolutional Neural Networks (CNNs) are designed for image and video recognition tasks.
CNNs can automatically learn features from images, which makes them well suited for tasks
such as image classification, object detection, and image segmentation.
• Recurrent Neural Networks (RNNs) are a type of neural network that can process sequential
data, such as time series and natural language. RNNs maintain an internal state that captures
information about previous inputs, which makes them well suited for tasks such as speech
recognition, natural language processing (NLP), and language translation.
RECURRENT NEURAL NETWORKS
The input sequence X and the output sequence Y can have different lengths.
BEFORE STARTING RECURRENT NEURAL NETWORKS – NOTATION
Example: Steve Jobs and Steve Wozniak began Apple Computer in 1976
NAMED-ENTITY RECOGNITION
X: Steve Jobs and Steve Wozniak began Apple Computer in 1976, so T_x = 10
Y: 1 1 0 1 1 0 0 0 0 0, so T_y = 10
Vocabulary: each word x^<t> is represented by its index in a fixed vocabulary, typically as a one-hot vector.
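A minimal Python sketch of this representation (the toy vocabulary below is an assumption chosen for illustration; a real vocabulary would contain tens of thousands of words):

    import numpy as np

    # Hypothetical toy vocabulary, lower-cased; real vocabularies are much larger.
    vocab = ["1976", "and", "apple", "began", "computer", "in", "jobs", "steve", "wozniak"]
    word_to_index = {w: i for i, w in enumerate(vocab)}

    def one_hot(word):
        # x^<t>: all zeros except a single 1 at the word's vocabulary index
        v = np.zeros(len(vocab))
        v[word_to_index[word]] = 1.0
        return v

    sentence = "steve jobs and steve wozniak began apple computer in 1976".split()
    X = np.stack([one_hot(w) for w in sentence])    # shape (T_x, |V|) = (10, 9)
    Y = np.array([1, 1, 0, 1, 1, 0, 0, 0, 0, 0])    # one label per word, T_y = 10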
WHY RECURRENT NEURAL NETWORKS?
Issues with using a standard feedforward network on text:
• The inputs and outputs can have different lengths in different examples.
• It does not share features learned across different positions of the text.
CALCULATE THE OUTPUT OF RECURRENT NEURAL NETWORKS
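As a minimal NumPy sketch, assuming the standard RNN update equations a^<t> = tanh(Waa a^<t-1> + Wax x^<t> + ba) and y^<t> = softmax(Wya a^<t> + by), with toy dimensions and random weights chosen only for illustration:

    import numpy as np

    def rnn_forward(X, Waa, Wax, Wya, ba, by):
        # X has shape (T_x, n_x); one prediction y^<t> is produced per time step.
        a = np.zeros(Waa.shape[0])                       # a^<0>: initial hidden state
        outputs = []
        for x_t in X:
            a = np.tanh(Waa @ a + Wax @ x_t + ba)        # hidden state a^<t>
            z = Wya @ a + by
            outputs.append(np.exp(z) / np.exp(z).sum())  # softmax output y^<t>
        return np.stack(outputs)

    # Toy dimensions (assumed): 10 time steps, 9-dim inputs, 16 hidden units, 2 classes.
    T_x, n_x, n_a, n_y = 10, 9, 16, 2
    rng = np.random.default_rng(0)
    Y_hat = rnn_forward(rng.normal(size=(T_x, n_x)),
                        Waa=0.1 * rng.normal(size=(n_a, n_a)),
                        Wax=0.1 * rng.normal(size=(n_a, n_x)),
                        Wya=0.1 * rng.normal(size=(n_y, n_a)),
                        ba=np.zeros(n_a), by=np.zeros(n_y))
    print(Y_hat.shape)                                   # (10, 2)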
Gate controllers
Each gate controller uses a sigmoid activation, so its outputs lie between 0 (gate fully closed) and 1 (gate fully open).
• Forget gate: controls which parts of the long-term memory should be removed.
• Input gate: controls which parts of the candidate state g should be added to the long-term memory.
• Output gate: controls which parts of the long-term state should be read and output at the current time step.
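A minimal NumPy sketch of a single LSTM step implementing these three gates (the packed weight layout and the toy sizes are assumptions; only the standard sigmoid/tanh gate structure is taken as given):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x, h_prev, c_prev, W, b):
        # W packs the four gate weight matrices; each acts on [h_prev, x].
        n = h_prev.shape[0]
        z = W @ np.concatenate([h_prev, x]) + b
        f = sigmoid(z[0*n:1*n])     # forget gate: what to erase from long-term state c
        i = sigmoid(z[1*n:2*n])     # input gate: which parts of g to add to c
        o = sigmoid(z[2*n:3*n])     # output gate: which parts of c to output this step
        g = np.tanh(z[3*n:4*n])     # candidate values g
        c = f * c_prev + i * g      # new long-term state
        h = o * np.tanh(c)          # new short-term state / output at this time step
        return h, c

    # Toy sizes (assumed): 4-dim input, 3 hidden units.
    n_x, n_h = 4, 3
    rng = np.random.default_rng(1)
    W = 0.1 * rng.normal(size=(4 * n_h, n_h + n_x))
    h, c = lstm_step(rng.normal(size=n_x), np.zeros(n_h), np.zeros(n_h), W, np.zeros(4 * n_h))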
ARCHITECTURES OF RNNs - GATED RECURRENT UNITS
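As a sketch of the standard GRU update, which merges the LSTM's gates into an update gate z and a reset gate r (the toy sizes and random weights below are assumptions, and bias terms are omitted for brevity):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gru_step(x, h_prev, Wz, Wr, Wh):
        hx = np.concatenate([h_prev, x])
        z = sigmoid(Wz @ hx)                    # update gate: how much of the state to replace
        r = sigmoid(Wr @ hx)                    # reset gate: how much past state feeds the candidate
        h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]))   # candidate state
        return (1 - z) * h_prev + z * h_tilde   # new hidden state

    # Toy sizes (assumed): 4-dim input, 3 hidden units.
    n_x, n_h = 4, 3
    rng = np.random.default_rng(2)
    shape = (n_h, n_h + n_x)
    h = gru_step(rng.normal(size=n_x), np.zeros(n_h),
                 Wz=0.1 * rng.normal(size=shape),
                 Wr=0.1 * rng.normal(size=shape),
                 Wh=0.1 * rng.normal(size=shape))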
In a grayscale image, the value of each pixel in the matrix ranges from 0 to 255. In an RGB image, each pixel has the combined values of R, G, and B.
The primary purpose of convolution in a CNN is to extract features from the input image. This layer's parameters consist of a set of learnable filters.
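A minimal NumPy sketch of what a single filter does during convolution (the 3x3 vertical-edge kernel below is a hand-picked assumption; in a CNN these filter values are learned):

    import numpy as np

    def conv2d(image, kernel, stride=1):
        # Valid (no-padding) convolution of a grayscale image with one filter.
        H, W = image.shape
        F = kernel.shape[0]
        H_out = (H - F) // stride + 1
        W_out = (W - F) // stride + 1
        out = np.zeros((H_out, W_out))
        for i in range(H_out):
            for j in range(W_out):
                patch = image[i*stride:i*stride+F, j*stride:j*stride+F]
                out[i, j] = np.sum(patch * kernel)       # dot product of patch and filter
        return out

    image = np.random.randint(0, 256, size=(8, 8)).astype(float)   # grayscale pixels in 0..255
    kernel = np.array([[-1., 0., 1.],
                       [-1., 0., 1.],
                       [-1., 0., 1.]])
    print(conv2d(image, kernel).shape)                             # (6, 6)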
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
If we have an input of size W x W x D and D_out kernels with a spatial size of F, stride S,
and padding P, then the spatial size of the output volume can be determined by the following formula:
W_out = (W - F + 2P) / S + 1
Where:
Stride: It is a hyperparameter that determines the step size of the convolutional filter as it moves
across the input image or the feature map.
Padding: in CNNs this usually means zero-padding, i.e. surrounding the borders of the input
volume with zeros. Padding is a technique used in convolutional neural networks (CNNs) to
preserve the spatial dimensions of the input volume across convolutional layers.
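A quick check of the formula in code (the input sizes below are assumed example values):

    def conv_output_size(W, F, P, S):
        # W_out = (W - F + 2P) / S + 1
        return (W - F + 2 * P) // S + 1

    print(conv_output_size(W=32, F=3, P=1, S=1))   # 32: 'same' padding keeps the spatial size
    print(conv_output_size(W=32, F=5, P=0, S=1))   # 28: no padding shrinks the output
    print(conv_output_size(W=32, F=3, P=1, S=2))   # 16: stride 2 roughly halves the output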
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Max pooling: takes the maximum value within each window covered by the pooling filter and
replaces the window with that value.
Average pooling: takes the average value of a set of numbers in a small region of the
feature map and replaces the region with that value.
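A small NumPy sketch contrasting the two pooling operations over non-overlapping 2x2 windows (the input values are arbitrary):

    import numpy as np

    def pool2d(x, size=2, mode="max"):
        # Non-overlapping pooling; assumes the input height and width are divisible by `size`.
        H, W = x.shape
        windows = x.reshape(H // size, size, W // size, size)
        return windows.max(axis=(1, 3)) if mode == "max" else windows.mean(axis=(1, 3))

    x = np.array([[1., 3., 2., 0.],
                  [5., 6., 1., 2.],
                  [7., 2., 9., 4.],
                  [0., 1., 3., 8.]])
    print(pool2d(x, mode="max"))   # [[6. 2.] [7. 9.]]
    print(pool2d(x, mode="avg"))   # [[3.75 1.25] [2.5  6.  ]]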
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Optimizer: an algorithm that updates the weights and biases of the network during the
training process. The goal of the optimizer is to minimize the error between the predicted
output of the CNN and the actual output. Examples: Stochastic Gradient Descent (SGD), Adam.
Hyperparameters (an example configuration; a code sketch using these settings follows this list):
Number of Convolutional Layers: 2
Number of Filters: 32 in the first layer, and 64 in the second layer.
Kernel Size: 3x3 for both layers.
Stride: 1 for both layers.
Padding: 'same' padding is used to maintain the input size.
Pooling: Max pooling with a pool size of 2x2 is applied after each convolutional layer.
Learning Rate: 0.001
Dropout Rate: 0.25
Batch Size: 32
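A sketch of a CNN built with these hyperparameters using the Keras API (the 28x28x1 input shape and the 10-class output layer are assumptions, since no dataset is specified):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(28, 28, 1)),                        # assumed input size
        layers.Conv2D(32, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.25),                                  # dropout rate from the list
        layers.Dense(10, activation="softmax"),                # assumed 10 output classes
    ])

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(x_train, y_train, batch_size=32, epochs=10)    # batch size from the list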
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Flattening converts the data into a one-dimensional array so that it can be fed into the next (fully connected) layer.
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Activation function: in the final fully connected layer, the SoftMax activation is often used to convert
the layer's output into a probability distribution over the possible classes.
Example: if the output of the FC layer for a particular input is [2.0, 1.5, 0.5], applying the SoftMax
function transforms it into approximately [0.547, 0.331, 0.122], which represents the probabilities of the
input belonging to each of the three possible classes.
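The SoftMax values above can be verified directly:

    import numpy as np

    logits = np.array([2.0, 1.5, 0.5])             # raw outputs of the FC layer
    probs = np.exp(logits) / np.exp(logits).sum()  # SoftMax
    print(probs.round(3))                          # [0.547 0.331 0.122]
    print(probs.sum())                             # ~1.0: a valid probability distribution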
CONVOLUTIONAL NEURAL NETWORKS
Regularization
Regularization techniques are commonly used in deep learning models to prevent overfitting during training.
Early stopping: This involves monitoring the performance of the model on a validation set during
training and stopping the training process when the performance on the validation set begins to
degrade.
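A sketch of early stopping using a Keras callback (the patience value and validation split are assumptions):

    from tensorflow import keras

    early_stop = keras.callbacks.EarlyStopping(
        monitor="val_loss",           # track performance on the validation set
        patience=3,                   # stop after 3 epochs without improvement (assumed value)
        restore_best_weights=True,    # roll the model back to its best epoch
    )

    # model.fit(x_train, y_train, validation_split=0.2, epochs=50,
    #           batch_size=32, callbacks=[early_stop])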
CONVOLUTIONAL NEURAL NETWORKS: Regularization
Data Augmentation
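Data augmentation can be sketched with Keras preprocessing layers (the particular transformations and their ranges below are assumptions chosen for illustration):

    from tensorflow import keras
    from tensorflow.keras import layers

    # Random transformations applied to training images to artificially enlarge the dataset.
    data_augmentation = keras.Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),    # rotate by up to ±10% of a full turn
        layers.RandomZoom(0.1),
    ])
    # Typically placed at the start of the model or applied to batches in the input pipeline.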
Transfer Learning
VGG16 - it is 16 weight layers deep and has about 138 million parameters (approx.).
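A sketch of transfer learning with a pretrained VGG16 base in Keras (freezing the base and adding a new 10-class head are assumptions about the target task):

    from tensorflow import keras
    from tensorflow.keras import layers

    # Load VGG16 pretrained on ImageNet, without its original classification head.
    base = keras.applications.VGG16(weights="imagenet", include_top=False,
                                    input_shape=(224, 224, 3))
    base.trainable = False             # freeze the pretrained convolutional layers

    model = keras.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),     # new head; sizes are assumptions
        layers.Dense(10, activation="softmax"),   # assumed 10 target classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])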