06 - LLM
[Figure: an unrolled RNN. Inputs x(i-1), x(i), x(i+1) feed successive cells that share the same weights W; each cell's output is fed back into the next step (the feedback loop).]
Types of RNN
There are four types of Recurrent Neural Networks:
1. One to One
2. One to Many
3. Many to One
4. Many to Many

1. One to One
This type of neural network is known as the Vanilla Neural Network. It is used for general machine learning problems that have a single input and a single output.
[Figure: one-to-one network, a single input x(i) mapped to a single output y(i)]

2. One to Many
This type of neural network has a single input and multiple outputs. An example of this is image captioning.
[Figure: one-to-many network, a single input x(i) fanned out to several outputs y(i)]
Types of RNN (contd)

3. Many to One
This RNN takes a sequence of inputs and generates a single output. Sentiment analysis is a good example of this kind of network, where a given sentence can be classified as expressing positive or negative sentiment.
[Figure: many-to-one network, a sequence of inputs x(i) reduced to a single output y(i)]

4. Many to Many
This RNN takes a sequence of inputs and generates a sequence of outputs. Machine translation is one example; converting word to voice is another.
[Figure: many-to-many network, a sequence of inputs x(i) mapped to a sequence of outputs y(i)]
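All four layouts share the same recurrence; only the arrangement of inputs and outputs differs. As a rough illustration, here is a minimal NumPy sketch of a single recurrent step (the function name, weight names, and sizes are assumptions for illustration, not from the slides):

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # The new hidden state mixes the current input with the previous
    # hidden state: this is the feedback loop in the diagrams above.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# The same weights are reused at every time step of the sequence.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(8, 4))   # input-to-hidden weights
W_hh = rng.normal(size=(8, 8))   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(8)
h = np.zeros(8)
for x_t in rng.normal(size=(5, 4)):   # toy sequence of 5 input vectors
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)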
Most Popular Applications of RNN
RNNs are most effective in handling sequential types of data across different domains. In the healthcare and pharmacy sectors specifically, RNNs have shown promise in patient care management, treatment personalization, and predictive analytics, which are crucial for both clinical research and drug discovery. Their ability to process sequential and time-series data makes them invaluable in these fields.
Some of the most popular applications of RNNs include:
LSTM
[Figure: LSTM cell. The previous state h(t-1) and the current input X(t) pass through σ, σ, tanh, and σ gates to produce the new state h(t).]
LSTM (contd)
1. Forget Gate
Step 1: Determine which details should be removed from the current input.
The first step in the LSTM is to decide which information should be omitted from the cell in that particular time step. The sigmoid function (σ) determines this. It looks at the previous state (ht-1) along with the current input Xt and computes the function ft. The function ft decides what information from the previous state to retain and what to forget because it is unimportant or irrelevant.
Consider the following two sentences:
Let the output of (ht-1) be “Alice is good in Physics. John, on the other
hand, is good at Chemistry.”
Let the current input at Xt be “He told me yesterday over the phone that
he had served as the captain of his college football team.”
With the current input at Xt, the input gate analyzes the important information: John plays football, and the fact that he was the captain of his college team is important.
“He told me yesterday over the phone” is less important, so it can be dropped. This process of adding controlled new information is done via the input gate.
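In the standard LSTM formulation this is Step 2, the input gate: a sigmoid chooses which entries to update and a tanh proposes candidate values, which together update the cell state. A minimal sketch under those assumptions (weight names are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gate_update(h_prev, x_t, C_prev, f_t, W_i, b_i, W_C, b_C):
    concat = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W_i @ concat + b_i)       # which entries to update
    C_tilde = np.tanh(W_C @ concat + b_C)   # candidate new values
    return f_t * C_prev + i_t * C_tilde     # updated cell state C_t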
LSTM (contd)
3. Output Gate
Step 3: Decide what part of the current cell state makes it to the output.
The third step is to decide what the output will be. First, a sigmoid layer decides which parts of the cell state make it to the output. Then the cell state is passed through tanh, which pushes the values to lie between -1 and 1, and multiplied by the output of the sigmoid gate.
This stage thus finally decides what information from the current state to retain.
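A sketch of this step under the same assumed notation (a sigmoid picks the parts of the cell state to expose, tanh squashes the state to [-1, 1], and the two are multiplied):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_gate(h_prev, x_t, C_t, W_o, b_o):
    concat = np.concatenate([h_prev, x_t])
    o_t = sigmoid(W_o @ concat + b_o)   # which parts of the state to output
    return o_t * np.tanh(C_t)           # new hidden state h_t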
With respect to the example sentences:
Let’s try to predict the next word in the sentence: “John played tremendously well against the opponent and won for his team. For his contributions, __???__ was awarded player of the match.”
There could be other choices for the empty space too. However, since the context revolves around John, “John” is the best output to follow “contributions”.
LSTM (contd)
• The final LSTM network is made up of cells that are connected in a chain structure, as shown in the sketch below. The cells store information, whereas the gates manipulate memory.
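Putting the three gates together, one illustrative cell can be applied step by step along a sequence, carrying the hidden state h and cell state C forward through the chain (all names and sizes are assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, p):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(p["W_f"] @ z + p["b_f"])                    # forget gate
    i = sigmoid(p["W_i"] @ z + p["b_i"])                    # input gate
    C = f * C_prev + i * np.tanh(p["W_C"] @ z + p["b_C"])  # new cell state
    o = sigmoid(p["W_o"] @ z + p["b_o"])                    # output gate
    return o * np.tanh(C), C                                # h_t, C_t

rng = np.random.default_rng(0)
H, X = 8, 4
p = {k: rng.normal(size=(H, H + X)) for k in ("W_f", "W_i", "W_C", "W_o")}
p.update({k: np.zeros(H) for k in ("b_f", "b_i", "b_C", "b_o")})
h, C = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(6, X)):   # toy sequence of 6 inputs
    h, C = lstm_step(x_t, h, C, p)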
Bidirectional LSTMs
• In a Bidirectional Recurrent Neural Network (BRNN), each training sequence is presented forwards and backwards to two independent recurrent nets, both of which are coupled to the same output layer. This means that the BRNN has comprehensive, sequential knowledge about all points before and after each point in a given sequence.
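In practice a BRNN is often built by wrapping an LSTM layer in a bidirectional wrapper. A sketch using the Keras API (the layer sizes and the sequence-classification task are illustrative assumptions):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(None, 16)),          # variable-length sequences of 16-dim vectors
    layers.Bidirectional(layers.LSTM(32)),   # forward and backward LSTMs, outputs concatenated
    layers.Dense(1, activation="sigmoid"),   # e.g. sentence-level sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy")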
LSTM applications
LSTM has a number of well-known applications, including:
• Image captioning
• Machine translation
• Language modeling
• Handwriting generation
• Question answering chatbots
1. Obama-RNN
Here the author used an RNN to generate hypothetical political speeches given by Barack Obama. Taking in over 4.3 MB / 730,895 words of text written by Obama’s speech writers as input, the model generates multiple versions with a wide range of topics, including jobs, the war on terrorism, democracy, and China.
https://medium.com/@samim/obama-rnn-machine-generated-political-speeches-c8abd18a2ea0
2. Harry Potter
Here the author trained an LSTM Recurrent Neural Network on the first 4
Harry Potter books. The model is asked to produce a chapter based on
what it learned.
https://
Deep Learning is best suited for
COMPLEX PROBLEMS
such as Image recognition, Speech
recognition, or Natural Language
Processing,
PROVIDED
you have enough data, computing power,
and PATIENCE.
Thanks