
UNIT-III
Recurrent Neural Network (RNN)

What is a Recurrent Neural Network (RNN)?

Definition: RNNs are neural networks designed for sequential data, in which
neurons have connections that loop back to themselves or to neurons within
the same layer, in addition to connections to the next layer.

Recurrent neural networks (RNNs) are used by Apple's Siri and Google's
voice search. The RNN is the first neural network that remembers its input,
thanks to an internal memory.

Structure:

 Recurrent Connections: Neurons have connections looping back to
themselves, allowing information to persist.
 Hidden State: Maintains a memory of previous inputs.

Sequential data is data, such as words, sentences, or time-series data,
where sequential components interrelate based on complex semantics and
syntax rules.

Why Recurrent Neural Networks?

RNNs were created because feed-forward neural networks have a few
limitations:

 Cannot handle sequential data

 Considers only the current input

 Cannot memorize previous inputs

The solution to these issues is the RNN. An RNN can handle
sequential data, accepting both the current input and previously
received inputs. RNNs can memorize previous inputs thanks to their
internal memory.

In traditional neural networks, all the inputs and outputs are
independent of each other. However, to predict the next word in a
sentence, the previous words are often needed, so it is important to
remember them. That is why the RNN model was developed; it solves
this issue by using the hidden layer.
Recurrent vs. Feed-Forward Neural Networks (or)
How an RNN differs from a traditional neural network (ANN)

1. ANN: Artificial neural networks that do not have looping nodes are
called feed-forward neural networks.
RNN: In an RNN, the information cycles through a loop.

2. ANN: Feed-forward neural networks have no memory of the input
they receive and are bad at predicting what is coming next.
RNN: Has internal memory.

3. The two images below illustrate the difference in information flow
between an RNN and a feed-forward neural network.
Architecture of Recurrent Neural Network (RNN)
This architecture makes RNNs particularly suited for processing sequential
data.
RNNs differ from classical multi-layer perceptron (MLP)
networks for two main reasons:
1) They take into account what happened previously, and
2) They share parameters/weights.

RNNs are a type of neural network that has hidden states and allows
past outputs to be used as inputs.

Key components:

Input Layer: This layer receives the initial element of the sequence
data. For example, in a sentence, it might receive the first word as a
vector representation.

Hidden Layer: The heart of the RNN, the hidden layer contains a set
of interconnected neurons. Each neuron processes the current input
along with the information from the previous hidden layer’s state.
This “state” captures the network’s memory of past inputs, allowing
it to understand the current element in context.
Activation Function: This function introduces non-linearity into the
network, enabling it to learn complex patterns. It transforms the
combined input from the current input layer and the previous hidden
layer state before passing it on.

The most commonly utilized activation functions are Sigmoid, Tanh, and ReLU.

Output Layer: The output layer generates the network's prediction
based on the processed information. In a language model, it might
predict the next word in the sequence.

Recurrent Connection: A key distinction of RNNs is the recurrent
connection within the hidden layer. This connection allows the
network to pass the hidden state information (the network's
memory) to the next time step. It is like passing a baton in a relay
race, carrying information about previous inputs forward.

 For calculating the current state:

h_t = f(h_{t-1}, x_t)

where:
x_t is the input at the current time step,
h_{t-1} is the previous state,
h_t is the current state.

 For calculating the current state with the activation function (tanh):

h_t = tanh(W_hh · h_{t-1} + W_xh · x_t)

where:
W_xh is the weight matrix at the input neuron,
W_hh is the weight matrix at the recurrent neuron.

 For calculating the output:

y_t = W_hy · h_t

where:
y_t is the output,
W_hy is the weight matrix at the output layer.
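
A minimal sketch of these three equations in Python with NumPy (the
dimensions, random seed, and variable names are illustrative assumptions,
and bias terms are omitted for brevity):

import numpy as np

# Illustrative (assumed) sizes: 4 input features, 3 hidden units, 2 outputs.
input_size, hidden_size, output_size = 4, 3, 2

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((hidden_size, input_size))   # weights at the input neuron
W_hh = rng.standard_normal((hidden_size, hidden_size))  # weights at the recurrent neuron
W_hy = rng.standard_normal((output_size, hidden_size))  # weights at the output layer

def rnn_step(x_t, h_prev):
    # h_t = tanh(W_hh · h_{t-1} + W_xh · x_t)
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)
    # y_t = W_hy · h_t
    y_t = W_hy @ h_t
    return h_t, y_t

h = np.zeros(hidden_size)             # initial hidden state (no past inputs yet)
x = rng.standard_normal(input_size)   # one toy input vector
h, y = rnn_step(x, h)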

Steps for Training a Recurrent Neural Network

Given below are a few steps for training a recurrent neural network
(a minimal code sketch follows the list):

 The initial input is sent to the input layer; every time step uses the
same weights and activation function.
 Using the current input and the previous state's output, the current
state is calculated.
 The current state h_t then becomes h_{t-1} for the next time step.
 These steps repeat for as many time steps as the problem requires,
combining information from all the previous steps.
 The final output is calculated from the final state, which summarizes
all the previous steps.
 An error is then generated by calculating the difference between the
actual output and the output generated by the RNN model.
 Finally, backpropagation occurs, wherein the error is backpropagated
to update the weights.
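
A minimal, runnable sketch of one such training step, assuming PyTorch
and toy random data (the dimensions and the mean-squared-error loss are
illustrative choices, not prescribed by these notes):

import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, input_size, hidden_size, output_size = 5, 4, 3, 2

rnn = nn.RNN(input_size, hidden_size, batch_first=True)  # shared weights across steps
readout = nn.Linear(hidden_size, output_size)            # output layer (W_hy)
params = list(rnn.parameters()) + list(readout.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)

x = torch.randn(1, seq_len, input_size)   # one toy input sequence
target = torch.randn(1, output_size)      # toy "actual output"

states, h_n = rnn(x)                      # unrolls over all time steps
y = readout(h_n.squeeze(0))               # prediction from the final state

loss = nn.functional.mse_loss(y, target)  # error vs. the actual output
loss.backward()                           # backpropagation through time
optimizer.step()                          # update the weights
optimizer.zero_grad()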

Types of RNNs:
Different types of architectures are used in recurrent neural networks:

1. One to one

2. One to Many

3. Many to One
4. Many to Many

One to one:

One-to-One RNN is the most basic and traditional type of neural
network, giving a single output for a single input. It is also known as a
Vanilla Neural Network. It is used to solve regular machine learning
problems, e.g. image classification, which takes an image and outputs a
single classification label.

One to many:

A "one-to-many" architecture represents a scenario where the network
receives a single input but generates a sequence of outputs (a sketch
follows the application list below).

 Single Input: The RNN takes in a single piece of information as input.
This could be an image, a musical note, a short sentence, or any data
point that serves as a starting point for the network.

 Multiple Outputs: The RNN processes the input and generates a
sequence of outputs over time. This sequence can vary in length
depending on the specific task.

Some common applications of one-to-many RNNs:

 Image Captioning: Based on a single image input, the RNN generates a
sentence or paragraph describing the image content (multiple outputs).
 Music Generation: The RNN receives a single starting note or short
melody as input and produces a sequence of musical notes forming a
complete piece (multiple outputs).
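
A minimal one-to-many sketch in Python with NumPy, where a single input
vector is unrolled into a sequence of outputs by feeding each output back
in as the next input (the shapes, seed, and feedback scheme are
illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
# Assumed sizes; input and output sizes match so outputs can be fed back.
size, hidden_size, steps = 4, 3, 6

W_xh = rng.standard_normal((hidden_size, size))
W_hh = rng.standard_normal((hidden_size, hidden_size))
W_hy = rng.standard_normal((size, hidden_size))

x = rng.standard_normal(size)   # the single input (e.g. an image embedding)
h = np.zeros(hidden_size)
outputs = []
for t in range(steps):
    h = np.tanh(W_hh @ h + W_xh @ x)  # update the hidden state
    y = W_hy @ h                      # emit one element of the output sequence
    outputs.append(y)
    x = y                             # feed the output back as the next input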

Many to One:
In recurrent neural networks (RNNs), a "many-to-one" architecture refers
to a specific type of RNN where the network processes a sequence of
inputs but produces a single output (a sketch follows the application
list below).

 Many Inputs: The RNN takes in a sequence of data points over time.
This sequence could be words in a sentence, sensor readings over a
period, or financial data points for multiple days.

 Single Output: After processing the entire sequence, the RNN generates
a single output value. This output could be a classification
(positive/negative sentiment), a prediction (next value in the time
series), or a summary of the information in the sequence.

Common applications of many-to-one RNNs:

 Sentiment Analysis: Given a sentence or review text (sequence of
words), classify its overall sentiment (positive, negative, or neutral) as
the single output.
 Spam Detection: Analyze an email's content (sequence of words) to
determine if it is spam (single output).
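
A minimal many-to-one sketch in Python with NumPy: the whole sequence is
consumed first, and a single output (here a toy sentiment probability) is
read from the final hidden state. Shapes, seed, and the sigmoid readout
are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
seq = rng.standard_normal((7, 4))        # 7 time steps, 4 features each (toy data)
hidden_size = 3

W_xh = rng.standard_normal((hidden_size, 4))
W_hh = rng.standard_normal((hidden_size, hidden_size))
w_hy = rng.standard_normal(hidden_size)  # single output unit

h = np.zeros(hidden_size)
for x_t in seq:                          # process the entire input sequence
    h = np.tanh(W_hh @ h + W_xh @ x_t)

score = w_hy @ h                         # single output after the last step
p_positive = 1 / (1 + np.exp(-score))    # e.g. probability of "positive" sentiment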

Many to Many:
In recurrent neural networks (RNNs), a "many-to-many" architecture
describes a scenario where the network processes a sequence of
inputs and generates a corresponding sequence of outputs. This means
both the input and output have multiple elements processed over time
steps (a sketch follows the application list below).

 Multiple Inputs: The RNN takes in a sequence of data points, similar to
many-to-one RNNs. This sequence could be words in a sentence, sensor
readings, or financial data points.

 Multiple Outputs: The RNN generates a new sequence of data points,
with a length that may or may not be the same as the input sequence.

Common Applications of Many-to-Many RNNs:

 Machine Translation: Translate text from one language to another
(e.g., English to French).
 Video Captioning: Generate captions describing the content of a video
(sequence of video frames as input, sequence of words as output).
 Text Summarization: Summarize a long document into a shorter
version with key points (sequence of sentences as input, shorter
sequence of sentences as output).
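
A minimal many-to-many sketch in Python with NumPy for the same-length
variant, where one output is emitted at every time step (shapes and seed
are illustrative assumptions; translation-style tasks with different
input/output lengths typically use an encoder-decoder pair instead):

import numpy as np

rng = np.random.default_rng(0)
seq = rng.standard_normal((5, 4))        # 5 input steps -> 5 output steps (toy data)
hidden_size, output_size = 3, 2

W_xh = rng.standard_normal((hidden_size, 4))
W_hh = rng.standard_normal((hidden_size, hidden_size))
W_hy = rng.standard_normal((output_size, hidden_size))

h = np.zeros(hidden_size)
outputs = []
for x_t in seq:
    h = np.tanh(W_hh @ h + W_xh @ x_t)   # update the shared hidden state
    outputs.append(W_hy @ h)             # one output per input step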

Advantages of RNN:

 RNNs can process inputs of any length.
 An RNN model is designed to remember information over time, which
is very helpful for any time-series predictor.
 Even if the input size is larger, the model size does not increase.
 The weights are shared across the time steps.
 RNNs can use their internal memory to process arbitrary series of
inputs, which is not the case with feed-forward neural networks.

Disadvantages of RNN:

 Due to its recurrent nature, the computation is slow.
 Training of RNN models can be difficult.
 If we are using ReLU or tanh as activation functions, it becomes very
difficult to process sequences that are very long.
 Prone to problems such as exploding and vanishing gradients.

Applications of Recurrent Neural Network (RNN)

Recurrent Neural Networks (RNNs) have a wide range of applications
across various fields due to their ability to process sequential data
effectively:
1. Natural Language Processing (NLP):

 Language Modeling and Text Generation: RNNs can predict the
probability of each word in a sequence, which is useful for generating
text.
 Machine Translation: Translating text from one language to another.
 Speech Recognition: Converting spoken language into text.
 Sentiment Analysis: Analyzing text data to determine the sentiment
expressed (positive, negative, neutral).
2. Time Series Analysis:

 Stock Price Prediction: Predicting future stock prices based on
historical data.
 Weather Forecasting: Predicting weather conditions like
temperature and rainfall.
 Demand Forecasting: Forecasting future product demand in retail or
supply chain management.
3. Sequential Data Processing:

 Event Prediction: Predicting future events based on past sequence
data, such as predicting equipment failure in predictive maintenance.
 Anomaly Detection: Identifying unusual patterns that do not conform
to expected behavior.
4. Healthcare:

 Medical Diagnosis: Analyzing medical data over time, such as patient
health records or vital signs, for diagnostic purposes.
 Drug Discovery: Predicting the potential effectiveness of new drugs.
5. Audio and Music Generation:

 Music Composition: Creating new pieces of music.
 Voice Synthesis: Generating human-like speech from text.
6. Video Processing:

 Action Recognition: Identifying specific actions or activities in video
data.
 Video Classification: Categorizing video clips into different genres or
types.
7. Gaming:

 Non-Player Character (NPC) Behavior: Creating more realistic and
responsive behaviors in NPCs.
 Game Strategy Analysis: Analyzing and predicting player actions.

Real Life RNN Use Cases

The following are four useful sequence model applications.

1. Google Gmail

When you type a sentence, Gmail will automatically complete it. Google
has an RNN embedded in this autocomplete feature.

2. Google Translate

Translates a sentence from one language to another.

3. Named Entity Recognition (NER)

Given a statement, it will analyse the text to detect and classify entities.

4. Sentiment Analysis

Given a statement, it will analyse the text to determine the sentiment or
emotional tone expressed within it.

Advantages of RNN:
Ability To Handle Variable-Length Sequences: RNNs are designed to
handle input sequences of variable length, which makes them well-suited
for tasks such as speech recognition, natural language processing, and time
series analysis.

Memory of Past Inputs: RNNs have a memory of past inputs, which allows
them to capture information about the context of the input sequence. This
makes them useful for tasks such as language modeling, where the meaning
of a word depends on the context in which it appears.
Parameter Sharing: RNNs share the same set of parameters across all
time steps, which reduces the number of parameters that need to be
learned and can lead to better generalization.

Non-Linear Mapping: RNNs use non-linear activation functions, which
allows them to learn complex, non-linear mappings between inputs and
outputs.

Sequential Processing: RNNs process input sequences step by step, which
lets them naturally capture the order and timing of the data.

Flexibility: RNNs can be adapted to a wide range of tasks and input types,
including text, speech, and image sequences.

Improved Accuracy: RNNs have been shown to achieve state-of-the-art
performance on a variety of sequence modeling tasks, including language
modeling, speech recognition, and machine translation.

Disadvantages of Recurrent Neural Network:

Vanishing And Exploding Gradients: RNNs can suffer from the problem
of vanishing or exploding gradients, which can make it difficult to train the
network effectively. This occurs when the gradients of the loss function
with respect to the parameters become very small or very large as they
propagate through time.
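
One common mitigation for exploding gradients is to clip the gradient
norm before each weight update. A minimal sketch with PyTorch (toy
dimensions and a placeholder loss, both illustrative assumptions):

import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=4, hidden_size=3, batch_first=True)
optimizer = torch.optim.SGD(rnn.parameters(), lr=0.01)

x = torch.randn(1, 50, 4)        # one long toy sequence (50 time steps)
out, h_n = rnn(x)
loss = out.pow(2).mean()         # placeholder loss for illustration
loss.backward()

# Rescale gradients so their global norm is at most 1.0, then update.
torch.nn.utils.clip_grad_norm_(rnn.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()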

Computational Complexity: RNNs can be computationally expensive to
train, especially when dealing with long sequences. This is because the
network has to process each input in sequence, which can be slow.

Difficulty In Capturing Long-Term Dependencies: Although RNNs are
designed to capture information about past inputs, they can struggle to
capture long-term dependencies in the input sequence. This is because the
gradients can become very small as they propagate through time, which
can cause the network to forget important information.

Lack Of Parallelism: RNNs are inherently sequential, which makes it
difficult to parallelize the computation. This can limit the speed and
scalability of the network.

Difficulty In Choosing The Right Architecture: There are many different
variants of RNNs, each with its own advantages and disadvantages.
Choosing the right architecture for a given task can be challenging, and may
require extensive experimentation and tuning.

Difficulty In Interpreting The Output: The output of an RNN can be
difficult to interpret, especially when dealing with complex inputs such as
natural language or audio. This can make it difficult to understand how the
network is making its predictions.
