
Presentation on

Deep Learning: Classifications and Their Models


Overview

‾ Introduction to Deep Learning (DL)


‾ Types of DL
‾ Recurrent Neural Networks
‾ Sequence Models
‾ Notation
‾ Why RNNs
‾ Calculating the Output of Recurrent Neural Networks
‾ Architecture of RNNs
‾ Convolutional Neural Network
‾ Architecture of CNNs and Its Layers
‾ Transfer Learning
INTRODUCTION TO DEEP LEARNING

Deep learning is a subset of machine learning that uses neural networks with many layers to learn
complex representations of data. So, neural networks are a key component of deep learning.

Deep learning has shown impressive performance on a wide range of tasks, including:
 Image classification

 Object detection and recognition


 Automatic Text Generation
 Language translation
 Sentiment analysis
INTRODUCTION TO DEEP LEARNING (CONTINUED)

Deep learning is gaining much popularity due to its supremacy in terms of accuracy when
trained with huge amounts of data.
TYPES OF DEEP LEARNING

The most widely used architectures in deep learning are convolutional neural networks (CNNs),
and recurrent neural networks (RNNs).

 Convolutional Neural Networks (CNNs) are designed specifically for image and video recognition
tasks. CNNs automatically learn features from images, which makes them well suited for tasks
such as image classification, object detection, and image segmentation.

 Recurrent Neural Networks (RNNs) are a type of neural network able to process
SEQUENTIAL DATA, such as time series and natural language (NLP). RNNs maintain an
internal state that captures information about previous inputs, which makes them well suited
for tasks such as speech recognition, natural language processing, and language translation.
RECURRENT NEURAL NETWORKS

 Input Layer: the information comes into the input layer.
 Hidden Layer: the information is passed from the input layer to the hidden layer.
 Output Layer: after this phase, the output is finally produced.

 They are recurrent:
 In a standard neural network we feed in all of the features at the same time.
 In an RNN we feed in the features one step at a time (in a time-step manner).
BEFORE STARTING RECURRENT NEURAL NETWORKS -
SEQUENCE MODELS

Why sequence models are important for RNNs

 RNNs have transformed speech recognition,
natural language processing and other areas.
 In speech recognition both X and Y are
sequence data: X is an audio clip that plays
out over time, and Y, the output, is a
sequence of words.
 In sentiment classification X is a sequence:
given an input phrase like "trendyol is
one of the best eCommerce platforms in
Turkiye", how many stars do you think this
review will get?
 In DNA sequence analysis, X is a DNA sequence
and we label which parts of that sequence
correspond to, say, a protein.

Thus the input X and the output Y are both sequences, and their lengths can differ.
BEFORE STARTING RECURRENT NEURAL NETWORKS –
NOTATION
Example: Steve Jobs and Steve Wozniak began Apple Computer in 1976

NAMED-ENTITY RECOGNITION
X: Steve Jobs and Steve Wozniak began Apple Computer in 1976 (10 words)
Y: 1 1 0 1 1 0 0 0 0 0 (10 labels)

Input sequence = 10 words (10 feature sets, one per word)

x^<t> : the input at position t of the sequence
<t> : index into the positions in the sequence (t suggests a temporal index, whether or not the sequence is actually temporal)
T_x : length of the input sequence (here T_x = 10)
T_y : length of the output sequence (here T_y = 10)
BEFORE STARTING RECURRENT NEURAL NETWORKS –
NOTATION

Steve Jobs and Steve Wozniak began Apple Computer in 1976

[Figure: each word of the sentence indexed against a vocabulary]
WHY RECURRENT NEURAL NETWORKS?

Issues with a standard neural network on sequences
 The inputs and outputs can have different lengths in different examples.
 It does not share features learned across different positions of the text.
CALCULATING THE OUTPUT OF RECURRENT NEURAL NETWORKS

RNN Output Formula
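The formula on the original slide is an image and is not reproduced here; for reference, the standard forward equations of a simple RNN in the <t> notation above (with a^<0> usually initialized to zeros) are

$$a^{\langle t \rangle} = g_1\left(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\right), \qquad \hat{y}^{\langle t \rangle} = g_2\left(W_{ya}\, a^{\langle t \rangle} + b_y\right)$$

where g_1 is typically tanh or ReLU and g_2 is typically sigmoid or softmax, depending on the task.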


TYPES OF USING RECURRENT NEURAL NETWORKS

 Sequence to Sequence – Price Forecasting


 Sequence to Vector – Scam or Not: sentiment analysis
 Vector to Sequence – Image Captioning
 Encoder – Decoder – Translation
DIFFERENT TYPES OF RECURRENT NEURAL NETWORKS

Type of RNN        Sequence lengths         Example

One-to-one         T_x = T_y = 1            Traditional neural network
One-to-many        T_x = 1, T_y > 1         Music generation
Many-to-one        T_x > 1, T_y = 1         Sentiment classification
Many-to-many       T_x = T_y                Named-entity recognition


BACKPROPAGATION THROUGH TIME (BPTT)

 The network output is calculated in its unrolled form.
 The cost is calculated.
 The gradients are calculated.
 The gradients are passed back and the weights are updated.
 The weight matrices are all the same, since they actually belong to one cell (see the sketch below).
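A minimal NumPy sketch of these four steps, assuming a tanh hidden layer, a single scalar output at the last time step, and a squared-error cost; all sizes, the data, and the learning rate are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, T = 3, 4, 5                          # input size, hidden size, time steps

Wax = rng.normal(scale=0.1, size=(n_h, n_x))   # input-to-hidden weights (shared)
Waa = rng.normal(scale=0.1, size=(n_h, n_h))   # hidden-to-hidden weights (shared)
Wya = rng.normal(scale=0.1, size=(1, n_h))     # hidden-to-output weights
ba, by = np.zeros((n_h, 1)), np.zeros((1, 1))

xs = [rng.normal(size=(n_x, 1)) for _ in range(T)]
target = np.array([[1.0]])

# 1) Forward pass in unrolled form: store every hidden state.
a = [np.zeros((n_h, 1))]
for t in range(T):
    a.append(np.tanh(Wax @ xs[t] + Waa @ a[-1] + ba))
y_hat = Wya @ a[-1] + by

# 2) Cost.
loss = 0.5 * float((y_hat - target) ** 2)

# 3) Gradients, passed back through every time step; the same weight
#    matrices are used at every step, so their gradients accumulate.
dWax, dWaa, dba = np.zeros_like(Wax), np.zeros_like(Waa), np.zeros_like(ba)
dy = y_hat - target
dWya, dby = dy @ a[-1].T, dy
da = Wya.T @ dy
for t in reversed(range(T)):
    dz = da * (1 - a[t + 1] ** 2)              # backprop through tanh
    dWax += dz @ xs[t].T
    dWaa += dz @ a[t].T
    dba += dz
    da = Waa.T @ dz                            # pass gradient to the previous step

# 4) Update the shared weights (plain gradient descent, learning rate 0.1).
for W, dW in [(Wax, dWax), (Waa, dWaa), (Wya, dWya), (ba, dba), (by, dby)]:
    W -= 0.1 * dW
```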
ARCHITECTURES OF RNNs – LONG SHORT-TERM MEMORY NETWORKS (LSTM)

Gate controllers
 Each gate uses a logistic (sigmoid) activation, so its outputs range from 0 (closed) to 1 (open).

 Forget gate: decides which parts of the long-term state should be removed.

 Input gate: decides which parts of the candidate g should be added to the long-term state.

 Output gate: decides which parts of the long-term state should be read and output at the current
time step (the standard gate equations are sketched below).
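For reference, one standard formulation of the LSTM equations (the notation is an assumption; the slides' own diagram is not reproduced here):

$$
\begin{aligned}
f^{\langle t \rangle} &= \sigma(W_f\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f) && \text{(forget gate)}\\
i^{\langle t \rangle} &= \sigma(W_i\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_i) && \text{(input gate)}\\
o^{\langle t \rangle} &= \sigma(W_o\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o) && \text{(output gate)}\\
g^{\langle t \rangle} &= \tanh(W_g\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_g) && \text{(candidate)}\\
c^{\langle t \rangle} &= f^{\langle t \rangle} \otimes c^{\langle t-1 \rangle} + i^{\langle t \rangle} \otimes g^{\langle t \rangle} && \text{(long-term state)}\\
a^{\langle t \rangle} &= o^{\langle t \rangle} \otimes \tanh(c^{\langle t \rangle}) && \text{(short-term state / output)}
\end{aligned}
$$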
ARCHITECTURES OF RNNs - GATED RECURRENT UNITS

 A simplified version of the LSTM cell.

 There is no separate output vector; the whole state is output at every time step.
 The gate controller r decides which part of the previous state will be shown to the main layer g
(one common formulation of the equations is shown below).
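For reference, one common formulation of the GRU equations, following the convention in which z interpolates between the previous state and the candidate g (the notation is an assumption):

$$
\begin{aligned}
z^{\langle t \rangle} &= \sigma(W_z\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_z) && \text{(update gate)}\\
r^{\langle t \rangle} &= \sigma(W_r\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r) && \text{(reset gate)}\\
g^{\langle t \rangle} &= \tanh(W_g\,[r^{\langle t \rangle} \otimes a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_g) && \text{(candidate / main layer)}\\
a^{\langle t \rangle} &= (1 - z^{\langle t \rangle}) \otimes a^{\langle t-1 \rangle} + z^{\langle t \rangle} \otimes g^{\langle t \rangle} && \text{(new state = output)}
\end{aligned}
$$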
CONVOLUTIONAL NEURAL NETWORKS

 CNNs identify, classify, and visualize image data.

 A CNN extracts features from images and can solve the task using its layers.

Architectures

There are three major types of layers in a CNN:


 Convolutional layers
 Pooling layers and
 Fully connected (dense) layers.
CONVOLUTIONAL NEURAL NETWORKS LAYERS

Input Layer

 It contains the image data, where the image is a matrix of pixel values.
 In a grayscale image, the value of each pixel in the matrix ranges from 0 to 255.
 In an RGB image, each pixel has the combined values of R, G and B.

Convolutional Layer

 This layer is considered the principal structure of a CNN, with several filters that perform the
convolution operation.
 The primary purpose of convolution in a CNN is to extract features from the input image.
 This layer's parameters consist of a set of learnable filters.
CONVOLUTIONAL NEURAL NETWORKS LAYERS

If we have an input of size W x W x D and D_out kernels with a spatial size of F, stride S and
amount of padding P, then the size of the output volume is given by the following formula
(a worked example follows the list of symbols below):

$$W_{out} = \frac{W - F + 2P}{S} + 1$$
Where:

 W is the width and height of the input volume


(assuming it is square)
 F is the spatial size of the convolutional
filter/kernel
 P is the amount of padding applied to the input
volume
 S is the stride of the convolutional filter/kernel
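As a worked example (the specific numbers are illustrative, not from the slides): a 32 x 32 input (W = 32), a 3 x 3 filter (F = 3), padding P = 1 and stride S = 1 give

$$W_{out} = \frac{32 - 3 + 2 \cdot 1}{1} + 1 = 32,$$

so with this choice of padding the spatial size of the feature map is preserved.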
CONVOLUTIONAL NEURAL NETWORKS LAYERS

Example of Convolutional Operation:
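The original example is an image; the NumPy sketch below illustrates the same idea, sliding a 3 x 3 filter over a 5 x 5 input with stride 1 and no padding (the specific numbers are illustrative):

```python
import numpy as np

# Hypothetical 5x5 input and 3x3 filter, chosen only to illustrate the
# sliding-window convolution described above.
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]], dtype=float)
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=float)

out_size = image.shape[0] - kernel.shape[0] + 1      # (W - F)/S + 1 with S = 1, P = 0
feature_map = np.zeros((out_size, out_size))
for i in range(out_size):
    for j in range(out_size):
        window = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(window * kernel)  # element-wise multiply, then sum

print(feature_map)                                   # 3x3 feature map
```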


CONVOLUTIONAL NEURAL NETWORKS LAYERS

 Stride: It is a hyperparameter that determines the step size of the convolutional filter as it moves
across the input image or the feature map.

 Padding: in CNNs, padding usually means zero-padding, i.e., surrounding the borders of the input
volume with zeros. It is a technique used in convolutional neural networks (CNNs) to preserve the
spatial dimensions of the input volume across convolutional layers.
CONVOLUTIONAL NEURAL NETWORKS LAYERS

 Pooling layers reduce the size of the data representation.

 Max pooling: keeps the maximum value in each window covered by the filter.

 Average pooling: takes the average value of a set of numbers in a small region of the
feature map and replaces the region with that value.
CONVOLUTIONAL NEURAL NETWORKS LAYERS

 Optimizer: an algorithm that updates the weights and biases of the network during the
training process. The goal of the optimizer is to minimize the error between the predicted
output of the CNN and the actual output. Examples: Stochastic Gradient Descent (SGD), Adam.

Hyperparameters (an example configuration; a Keras sketch follows this list)
 Number of Convolutional Layers: 2
 Number of Filters: 32 in the first layer, and 64 in the second layer.
 Kernel Size: 3x3 for both layers.
 Stride: 1 for both layers.
 Padding: 'same' padding is used to maintain the input size.
 Pooling: Max pooling with a pool size of 2x2 is applied after each convolutional layer.
 Learning Rate: 0.001
 Dropout Rate: 0.25
 Batch Size: 32
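A minimal Keras sketch wiring these hyperparameters together; the input shape (28 x 28 x 1) and the number of output classes (10) are illustrative assumptions, not from the slides:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # First convolutional layer: 32 filters, 3x3 kernel, stride 1, 'same' padding.
    tf.keras.layers.Conv2D(32, (3, 3), strides=1, padding='same',
                           activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Second convolutional layer: 64 filters, same kernel size, stride and padding.
    tf.keras.layers.Conv2D(64, (3, 3), strides=1, padding='same',
                           activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Dropout(0.25),                    # dropout rate 0.25
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=32, epochs=10)   # batch size 32
```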
CONVOLUTIONAL NEURAL NETWORKS LAYERS

 Flatten Layer: converts a multidimensional input into a vector, i.e., one-dimensional.

 Flattening converts the data into a 1-dimensional array so it can be fed to the next layer.
CONVOLUTIONAL NEURAL NETWORKS LAYERS

Fully Connected Layer

 It connects every neuron in one layer to every neuron in the next layer.

 Pooling layers output 3D volumes, but a fully connected layer expects a 1D vector of numbers.

 After the FC layer, a SoftMax activation function is typically used.

Activation function: often used to convert the output of the layer into a probability distribution
over the possible classes.

Example: if the output of the FC layer for a particular input is [2.0, 1.5, 0.5], applying the SoftMax
function transforms it into approximately [0.546, 0.331, 0.122], which represents the probabilities of
the input belonging to each of the three possible classes (a short check is sketched below).
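A short NumPy check of that example (softmax here is just a local helper function, not a library call):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([2.0, 1.5, 0.5])))   # ~[0.546, 0.331, 0.122]
```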
CONVOLUTIONAL NEURAL NETWORKS

Regularization

 Regularization techniques are commonly used in deep learning models to prevent overfitting (or
underfitting) during training.

 Dropout is one regularization technique.

 It randomly and temporarily drops neurons during training.

 Early stopping: this involves monitoring the performance of the model on a validation set during
training and stopping the training process when the performance on the validation set begins to
degrade (a minimal sketch of both techniques follows).
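A minimal Keras sketch of dropout plus early stopping; the patience of 3 epochs and the validation split are illustrative choices:

```python
import tensorflow as tf

# Dropout randomly disables 25% of a layer's units, during training only.
dropout_layer = tf.keras.layers.Dropout(0.25)

# Early stopping: watch the validation loss and stop when it stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=3,
                                              restore_best_weights=True)

# model.fit(x_train, y_train, validation_split=0.1,
#           epochs=100, callbacks=[early_stop])
```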
CONVOLUTIONAL NEURAL NETWORKS: Regularization

Data Augmentation

 Data augmentation is done dynamically during training time.

 Common transformations are rotation, shifting, resizing, exposure adjustment,
contrast changes, etc.

 Data augmentation is applied only to the training data (a minimal sketch follows this list).
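A minimal Keras sketch of dynamic augmentation using ImageDataGenerator; the specific transformation ranges are illustrative choices:

```python
import tensorflow as tf

# Fresh, randomly transformed batches are generated on the fly each epoch;
# only the training images are augmented, validation/test data stay unchanged.
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=15,             # random rotation
    width_shift_range=0.1,         # random horizontal shifting
    height_shift_range=0.1,        # random vertical shifting
    zoom_range=0.1,                # random resizing / zoom
    brightness_range=(0.8, 1.2))   # exposure adjustment

# model.fit(train_gen.flow(x_train, y_train, batch_size=32), epochs=10)
```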
CONVOLUTIONAL NEURAL NETWORKS

Transfer Learning

 Helps to address the problem of insufficient training data.

 A pre-trained model has already been trained on a large dataset, such as ImageNet.

 Allows us to train models with much less data.

 It helps to prevent overfitting.

 Finally, it can improve the accuracy of models on the new task, as the pre-trained model has
already learned to recognize many relevant features in images (a minimal Keras sketch follows).
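A minimal Keras transfer-learning sketch reusing VGG16 pre-trained on ImageNet as a frozen feature extractor; the input size (224 x 224 x 3) and the number of target classes (5) are illustrative assumptions:

```python
import tensorflow as tf

# Load the pre-trained convolutional base without its original classifier head.
base = tf.keras.applications.VGG16(weights='imagenet', include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False                       # freeze the pre-trained layers

# Add a small new head and train only these layers on the (small) new dataset.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Dense(5, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```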
DEEP TRANSFER LEARNING MODEL

 DenseNet201 – 201 layers deep

 MobileNetV2 – performs well on mobile devices

 VGG16 – 16 layers deep, with about 138 million parameters

 AlexNet – 8 layers deep

 Others: Xception, Inception, EfficientNet, etc.


SUMMARY

 Introduction to RNNs with examples and explanation

 Essential preliminaries: sequence models and notation

 Popular RNN architectures: LSTM and GRU

 Basics of CNNs

 The different layers, with examples

 Regularization

 Short discussion of transfer learning (TL) and its architectures, e.g., VGG16 with 16 layers and
about 138 million parameters
References

Yin, W., Kann, K., Yu, M. and Schütze, H., 2017. Comparative study of CNN and RNN for
natural language processing. arXiv preprint arXiv:1702.01923.

Sherstinsky, A., 2020. Fundamentals of recurrent neural network (RNN) and long short-term
memory (LSTM) network. Physica D: Nonlinear Phenomena, 404, p.132306.

Choudhary, K., DeCost, B., Chen, C., Jain, A., Tavazza, F., Cohn, R., Park, C.W., Choudhary,
A., Agrawal, A., Billinge, S.J. and Holm, E., 2022. Recent advances and applications of deep
learning methods in materials science. npj Computational Materials, 8(1), p.59.

Zuo, C., Qian, J., Feng, S., Yin, W., Li, Y., Fan, P., Han, J., Qian, K. and Chen, Q., 2022. Deep
learning in optical metrology: a review. Light: Science & Applications, 11(1), p.39.

Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H. and
Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical
machine translation. arXiv preprint arXiv:1406.1078.
References

Koutnik, J., Greff, K., Gomez, F. and Schmidhuber, J., 2014, June. A clockwork rnn.
In International conference on machine learning (pp. 1863-1871). PMLR.

Li, S., Li, W., Cook, C., Zhu, C. and Gao, Y., 2018. Independently recurrent neural network
(indrnn): Building a longer and deeper rnn. In Proceedings of the IEEE conference on
computer vision and pattern recognition (pp. 5457-5466).

Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G., Cai,
J. and Chen, T., 2018. Recent advances in convolutional neural networks. Pattern recognition,
77, pp.354-377.

Li, Z., Liu, F., Yang, W., Peng, S. and Zhou, J., 2021. A survey of convolutional neural
networks: analysis, applications, and prospects. IEEE transactions on neural networks and
learning systems.
References

Albawi, S., Mohammed, T.A. and Al-Zawi, S., 2017, August. Understanding of a convolutional
neural network. In 2017 international conference on engineering and technology (ICET) (pp.
1-6). IEEE.

O'Shea, K. and Nash, R., 2015. An introduction to convolutional neural networks. arXiv
preprint arXiv:1511.08458.

Milosevic, N., Corchero, M., Gad, A.F. and Michelucci, U., 2020. Introduction to
convolutional neural networks: with image classification using PyTorch. Apress.

References

Yang, Q., 2008. An introduction to transfer learning. In Advanced Data Mining and
Applications: 4th International Conference, ADMA 2008, Chengdu, China, October 8-10,
2008. Proceedings 4 (pp. 1-1). Springer Berlin Heidelberg.

Torrey, L. and Shavlik, J., 2010. Transfer learning. In Handbook of research on machine
learning applications and trends: algorithms, methods, and techniques (pp. 242-264). IGI
global.
