Deep Learning
Deep learning is a subset of machine learning that uses neural networks with many layers to learn
complex representations of data; multi-layer neural networks are therefore its key building block.
Deep learning has shown impressive performance on a wide range of tasks, including image classification.
Deep learning is gaining popularity because of its superior accuracy when trained with large amounts of data.
TYPES OF DEEP LEARNING
The most widely used architectures in deep learning are convolutional neural networks (CNNs)
and recurrent neural networks (RNNs).
• Convolutional Neural Networks (CNNs) are designed for image and video recognition tasks.
CNNs can automatically learn features from images, which makes them well suited for tasks
such as image classification, object detection, and image segmentation.
• Recurrent Neural Networks (RNNs) are a type of neural network that can process sequential
data, such as time series and natural language. RNNs maintain an internal state that captures
information about previous inputs, which makes them well suited for tasks such as speech
recognition, natural language processing (NLP), and language translation.
RECURRENT NEURAL NETWORKS
The input sequence X and the output sequence Y can have different lengths.
BEFORE STARTING RECURRENT NEURAL NETWORKS – NOTATION
Example: Steve Jobs and Steve Wozniak began Apple Computer in 1976
NAMED-ENTITY RECOGNITION
X: Steve Jobs and Steve Wozniak began Apple Computer in 1976, so T_x = 10
Y: 1 1 0 1 1 0 0 0 0 0, so T_y = 10
Vocabulary: each word x^<t> is represented by its index in a fixed vocabulary, typically as a one-hot vector.
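A minimal Python sketch of this representation (the toy vocabulary below is an assumption chosen for illustration; a real vocabulary would contain tens of thousands of words):

    import numpy as np

    # Hypothetical toy vocabulary, lower-cased; real vocabularies are much larger.
    vocab = ["1976", "and", "apple", "began", "computer", "in", "jobs", "steve", "wozniak"]
    word_to_index = {w: i for i, w in enumerate(vocab)}

    def one_hot(word):
        # x^<t>: all zeros except a single 1 at the word's vocabulary index
        v = np.zeros(len(vocab))
        v[word_to_index[word]] = 1.0
        return v

    sentence = "steve jobs and steve wozniak began apple computer in 1976".split()
    X = np.stack([one_hot(w) for w in sentence])    # shape (T_x, |V|) = (10, 9)
    Y = np.array([1, 1, 0, 1, 1, 0, 0, 0, 0, 0])    # one label per word, T_y = 10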
WHY RECURRENT NEURAL NETWORKS?
Issues with using a standard feedforward network on text:
• The inputs and outputs can have different lengths in different examples.
• It does not share features learned across different positions of the text.
CALCULATE THE OUTPUT OF RECURRENT NEURAL NETWORKS
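As a minimal NumPy sketch, assuming the standard RNN update equations a^<t> = tanh(Waa a^<t-1> + Wax x^<t> + ba) and y^<t> = softmax(Wya a^<t> + by), with toy dimensions and random weights chosen only for illustration:

    import numpy as np

    def rnn_forward(X, Waa, Wax, Wya, ba, by):
        # X has shape (T_x, n_x); one prediction y^<t> is produced per time step.
        a = np.zeros(Waa.shape[0])                       # a^<0>: initial hidden state
        outputs = []
        for x_t in X:
            a = np.tanh(Waa @ a + Wax @ x_t + ba)        # hidden state a^<t>
            z = Wya @ a + by
            outputs.append(np.exp(z) / np.exp(z).sum())  # softmax output y^<t>
        return np.stack(outputs)

    # Toy dimensions (assumed): 10 time steps, 9-dim inputs, 16 hidden units, 2 classes.
    T_x, n_x, n_a, n_y = 10, 9, 16, 2
    rng = np.random.default_rng(0)
    Y_hat = rnn_forward(rng.normal(size=(T_x, n_x)),
                        Waa=0.1 * rng.normal(size=(n_a, n_a)),
                        Wax=0.1 * rng.normal(size=(n_a, n_x)),
                        Wya=0.1 * rng.normal(size=(n_y, n_a)),
                        ba=np.zeros(n_a), by=np.zeros(n_y))
    print(Y_hat.shape)                                   # (10, 2)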
Gate controllers
Each gate controller uses a sigmoid activation, so its outputs lie between 0 (gate fully closed) and 1 (gate fully open).
• Forget gate: controls which parts of the long-term memory should be removed.
• Input gate: controls which parts of the candidate state g should be added to the long-term memory.
• Output gate: controls which parts of the long-term state should be read and output at the current time step.
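A minimal NumPy sketch of a single LSTM step implementing these three gates (the packed weight layout and the toy sizes are assumptions; only the standard sigmoid/tanh gate structure is taken as given):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x, h_prev, c_prev, W, b):
        # W packs the four gate weight matrices; each acts on [h_prev, x].
        n = h_prev.shape[0]
        z = W @ np.concatenate([h_prev, x]) + b
        f = sigmoid(z[0*n:1*n])     # forget gate: what to erase from long-term state c
        i = sigmoid(z[1*n:2*n])     # input gate: which parts of g to add to c
        o = sigmoid(z[2*n:3*n])     # output gate: which parts of c to output this step
        g = np.tanh(z[3*n:4*n])     # candidate values g
        c = f * c_prev + i * g      # new long-term state
        h = o * np.tanh(c)          # new short-term state / output at this time step
        return h, c

    # Toy sizes (assumed): 4-dim input, 3 hidden units.
    n_x, n_h = 4, 3
    rng = np.random.default_rng(1)
    W = 0.1 * rng.normal(size=(4 * n_h, n_h + n_x))
    h, c = lstm_step(rng.normal(size=n_x), np.zeros(n_h), np.zeros(n_h), W, np.zeros(4 * n_h))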
ARCHITECTURES OF RNNs - GATED RECURRENT UNITS
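As a sketch of the standard GRU update, which merges the LSTM's gates into an update gate z and a reset gate r (the toy sizes and random weights below are assumptions, and bias terms are omitted for brevity):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gru_step(x, h_prev, Wz, Wr, Wh):
        hx = np.concatenate([h_prev, x])
        z = sigmoid(Wz @ hx)                    # update gate: how much of the state to replace
        r = sigmoid(Wr @ hx)                    # reset gate: how much past state feeds the candidate
        h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]))   # candidate state
        return (1 - z) * h_prev + z * h_tilde   # new hidden state

    # Toy sizes (assumed): 4-dim input, 3 hidden units.
    n_x, n_h = 4, 3
    rng = np.random.default_rng(2)
    shape = (n_h, n_h + n_x)
    h = gru_step(rng.normal(size=n_x), np.zeros(n_h),
                 Wz=0.1 * rng.normal(size=shape),
                 Wr=0.1 * rng.normal(size=shape),
                 Wh=0.1 * rng.normal(size=shape))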
In a grayscale image, the value of each pixel in the matrix ranges from 0 to 255. In an RGB image, each pixel has the combined values of R, G, and B.
The primary purpose of convolution in a CNN is to extract features from the input image. This layer's parameters consist of a set of learnable filters.
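A minimal NumPy sketch of what a single filter does during convolution (the 3x3 vertical-edge kernel below is a hand-picked assumption; in a CNN these filter values are learned):

    import numpy as np

    def conv2d(image, kernel, stride=1):
        # Valid (no-padding) convolution of a grayscale image with one filter.
        H, W = image.shape
        F = kernel.shape[0]
        H_out = (H - F) // stride + 1
        W_out = (W - F) // stride + 1
        out = np.zeros((H_out, W_out))
        for i in range(H_out):
            for j in range(W_out):
                patch = image[i*stride:i*stride+F, j*stride:j*stride+F]
                out[i, j] = np.sum(patch * kernel)       # dot product of patch and filter
        return out

    image = np.random.randint(0, 256, size=(8, 8)).astype(float)   # grayscale pixels in 0..255
    kernel = np.array([[-1., 0., 1.],
                       [-1., 0., 1.],
                       [-1., 0., 1.]])
    print(conv2d(image, kernel).shape)                             # (6, 6)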
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
If we have an input of size W x W x D and D_out kernels with a spatial size of F, stride S,
and padding P, then the spatial size of the output volume can be determined by the following formula:
W_out = (W - F + 2P) / S + 1
Where:
Stride: It is a hyperparameter that determines the step size of the convolutional filter as it moves
across the input image or the feature map.
Padding: in CNNs this usually means zero-padding, i.e. surrounding the borders of the input
volume with zeros. Padding is a technique used in convolutional neural networks (CNNs) to
preserve the spatial dimensions of the input volume across convolutional layers.
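A quick check of the formula in code (the input sizes below are assumed example values):

    def conv_output_size(W, F, P, S):
        # W_out = (W - F + 2P) / S + 1
        return (W - F + 2 * P) // S + 1

    print(conv_output_size(W=32, F=3, P=1, S=1))   # 32: 'same' padding keeps the spatial size
    print(conv_output_size(W=32, F=5, P=0, S=1))   # 28: no padding shrinks the output
    print(conv_output_size(W=32, F=3, P=1, S=2))   # 16: stride 2 roughly halves the output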
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Max pooling: takes the maximum value within each window covered by the pooling filter and
replaces the window with that value.
Average pooling: takes the average value of a set of numbers in a small region of the
feature map and replaces the region with that value.
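A small NumPy sketch contrasting the two pooling operations over non-overlapping 2x2 windows (the input values are arbitrary):

    import numpy as np

    def pool2d(x, size=2, mode="max"):
        # Non-overlapping pooling; assumes the input height and width are divisible by `size`.
        H, W = x.shape
        windows = x.reshape(H // size, size, W // size, size)
        return windows.max(axis=(1, 3)) if mode == "max" else windows.mean(axis=(1, 3))

    x = np.array([[1., 3., 2., 0.],
                  [5., 6., 1., 2.],
                  [7., 2., 9., 4.],
                  [0., 1., 3., 8.]])
    print(pool2d(x, mode="max"))   # [[6. 2.] [7. 9.]]
    print(pool2d(x, mode="avg"))   # [[3.75 1.25] [2.5  6.  ]]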
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Optimizer: an algorithm that updates the weights and biases of the network during the
training process. The goal of the optimizer is to minimize the error between the predicted
output of the CNN and the actual output. Examples: Stochastic Gradient Descent (SGD), Adam.
Hyperparameters (an example configuration; a code sketch using these settings follows this list):
Number of Convolutional Layers: 2
Number of Filters: 32 in the first layer, and 64 in the second layer.
Kernel Size: 3x3 for both layers.
Stride: 1 for both layers.
Padding: 'same' padding is used to maintain the input size.
Pooling: Max pooling with a pool size of 2x2 is applied after each convolutional layer.
Learning Rate: 0.001
Dropout Rate: 0.25
Batch Size: 32
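A sketch of a CNN built with these hyperparameters using the Keras API (the 28x28x1 input shape and the 10-class output layer are assumptions, since no dataset is specified):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(28, 28, 1)),                        # assumed input size
        layers.Conv2D(32, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.25),                                  # dropout rate from the list
        layers.Dense(10, activation="softmax"),                # assumed 10 output classes
    ])

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(x_train, y_train, batch_size=32, epochs=10)    # batch size from the list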
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Flattening converts the data into a one-dimensional array so that it can be fed into the next (fully connected) layer.
CONVOLUTIONAL NEURAL NETWORKS: LAYERS
Activation function: in the final fully connected layer, the SoftMax activation is often used to convert
the layer's output into a probability distribution over the possible classes.
Example: if the output of the FC layer for a particular input is [2.0, 1.5, 0.5], applying the SoftMax
function transforms it into approximately [0.547, 0.331, 0.122], which represents the probabilities of the
input belonging to each of the three possible classes.
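The SoftMax values above can be verified directly:

    import numpy as np

    logits = np.array([2.0, 1.5, 0.5])             # raw outputs of the FC layer
    probs = np.exp(logits) / np.exp(logits).sum()  # SoftMax
    print(probs.round(3))                          # [0.547 0.331 0.122]
    print(probs.sum())                             # ~1.0: a valid probability distribution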
CONVOLUTIONAL NEURAL NETWORKS
Regularization
Regularization techniques are commonly used in deep learning models to prevent overfitting during training.
Early stopping: This involves monitoring the performance of the model on a validation set during
training and stopping the training process when the performance on the validation set begins to
degrade.
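A sketch of early stopping using a Keras callback (the patience value and validation split are assumptions):

    from tensorflow import keras

    early_stop = keras.callbacks.EarlyStopping(
        monitor="val_loss",           # track performance on the validation set
        patience=3,                   # stop after 3 epochs without improvement (assumed value)
        restore_best_weights=True,    # roll the model back to its best epoch
    )

    # model.fit(x_train, y_train, validation_split=0.2, epochs=50,
    #           batch_size=32, callbacks=[early_stop])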
CONVOLUTIONAL NEURAL NETWORKS: Regularization
Data Augmentation
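Data augmentation can be sketched with Keras preprocessing layers (the particular transformations and their ranges below are assumptions chosen for illustration):

    from tensorflow import keras
    from tensorflow.keras import layers

    # Random transformations applied to training images to artificially enlarge the dataset.
    data_augmentation = keras.Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),    # rotate by up to ±10% of a full turn
        layers.RandomZoom(0.1),
    ])
    # Typically placed at the start of the model or applied to batches in the input pipeline.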
Transfer Learning
VGG16 - it is 16 weight layers deep and has about 138 million parameters (approx.).
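A sketch of transfer learning with a pretrained VGG16 base in Keras (freezing the base and adding a new 10-class head are assumptions about the target task):

    from tensorflow import keras
    from tensorflow.keras import layers

    # Load VGG16 pretrained on ImageNet, without its original classification head.
    base = keras.applications.VGG16(weights="imagenet", include_top=False,
                                    input_shape=(224, 224, 3))
    base.trainable = False             # freeze the pretrained convolutional layers

    model = keras.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),     # new head; sizes are assumptions
        layers.Dense(10, activation="softmax"),   # assumed 10 target classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])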