Recurrent Neural Network (RNN)
Tuan Nguyen - AI4E
Outline
● Motivation for RNN
● Introduction to RNN
● The structure of RNN
● Deep RNN
● RNN applications
Image classification
Video classification
https://www.youtube.com/watch?v=BmoVg3BCwF0&feature=emb_title
Neural network
[Figure: Recurrent Neural Network vs. feed-forward neural network]
Recurrent Neural Network
Usually drawn as:
Recurrent neural network
[Figure: a single recurrent unit at t=1 — input x1, hidden state h1, output y1, with a one-step delay on the recurrent connection]
RNN Formula
We can process a sequence of vectors x by applying a recurrence formula at every time step:

h_t = f_W(h_{t-1}, x_t)

where h_t is the new state, h_{t-1} is the old state, x_t is the input vector at time step t, and f_W is some function with parameters W.
RNN Formula
The same function f_W and the same parameters are used at every time step. For a vanilla RNN the state update is:

h_t = tanh(Wxh * x_t + Whh * h_{t-1})

and the output y_t is read out from the hidden state h_t.
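As a minimal numpy sketch of this update (the weight names Wxh and Whh follow the formula used later in the deck; the bias term bh is an assumption, since the slides omit it):

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, bh):
    # One vanilla-RNN recurrence step: h_t = tanh(Wxh * x_t + Whh * h_{t-1} + bh)
    return np.tanh(Wxh @ x + Whh @ h_prev + bh)
```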
Forward
[Figure: unrolled forward pass — inputs x1, x2, x3 combined with previous states h0, h1, h2 produce hidden states h1, h2, h3, outputs y1, y2, y3, and per-step costs C1, C2, C3]
Forward
[Figure: the same unrolled forward pass; the highlighted connections indicate shared weights across time steps]
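A sketch of this unrolled forward pass, assuming an output projection Why (a hypothetical name; the slides only draw y1, y2, y3). Note that the same Wxh and Whh are reused at every step, which is exactly the shared-weights point above:

```python
import numpy as np

def rnn_forward(xs, h0, Wxh, Whh, Why):
    h, hs, ys = h0, [], []
    for x in xs:                        # x1, x2, x3, ...
        h = np.tanh(Wxh @ x + Whh @ h)  # h1, h2, h3 use the SAME weights
        hs.append(h)
        ys.append(Why @ h)              # y1, y2, y3
    return hs, ys
```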
Deep RNN
[Figure: a stacked RNN unrolled over time — within each layer the same parameters are shared across all time steps, but the parameters of different layers are not equal]
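One way to see this in code is PyTorch's built-in RNN with num_layers=2 (a sketch, not from the slides): each layer reuses its own weights across all time steps, but the two layers have distinct parameters.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=2, batch_first=True)

x = torch.randn(1, 5, 8)        # (batch, time steps, input features)
out, h_n = rnn(x)
print(out.shape)                # torch.Size([1, 5, 16]) - top layer, every step
print(h_n.shape)                # torch.Size([2, 1, 16]) - final state per layer
# Within a layer, weights are shared over time; across layers they differ:
print(rnn.weight_hh_l0.shape, rnn.weight_hh_l1.shape)
```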
Recurrent neural network problem
Character-level language model example
Vocabulary: [h, e, l, o]
Example training sequence: “hello”
[Figure: an RNN reads input characters x and predicts the next character y at every step]
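A minimal numpy sketch of this example, assuming a hidden size of 8 and the one-hot input encoding implied by the slide. The weights are untrained here, so the predictions are random; the training step (cross-entropy loss and backpropagation through time) is omitted.

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']                     # vocabulary from the slide
char_to_ix = {c: i for i, c in enumerate(vocab)}

def one_hot(c):
    v = np.zeros(len(vocab))
    v[char_to_ix[c]] = 1.0
    return v

rng = np.random.default_rng(0)
Wxh = rng.normal(0, 0.01, (8, 4))    # input -> hidden
Whh = rng.normal(0, 0.01, (8, 8))    # hidden -> hidden
Why = rng.normal(0, 0.01, (4, 8))    # hidden -> scores over [h, e, l, o]

h = np.zeros(8)
for c, target in zip('hell', 'ello'):            # input/target pairs from "hello"
    h = np.tanh(Wxh @ one_hot(c) + Whh @ h)
    scores = Why @ h
    probs = np.exp(scores) / np.exp(scores).sum()  # softmax over the vocabulary
    print(f"input {c!r}: predicted {vocab[int(probs.argmax())]!r}, target {target!r}")
```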
Sentiment Classification
[Figure: an RNN runs over the input sequence; the per-step outputs are ignored, and the hidden states h1 … hn are combined, e.g. h = Sum(h1 … hn), then fed to a linear classifier]
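A sketch of this architecture in PyTorch; the layer sizes and the two-class output are assumptions, the slide only fixes the structure (RNN hidden states, summed into h, then a linear classifier):

```python
import torch
import torch.nn as nn

class SentimentRNN(nn.Module):
    """Run an RNN over word embeddings, ignore per-step outputs,
    and feed a summary of the hidden states to a linear classifier."""
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        out, _ = self.rnn(self.embed(token_ids))   # out: (batch, time, hidden)
        h = out.sum(dim=1)                         # h = Sum(h1 ... hn), as on the slide
        return self.classifier(h)

logits = SentimentRNN(vocab_size=1000)(torch.randint(0, 1000, (4, 12)))
print(logits.shape)   # torch.Size([4, 2])
```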
Image captioning
[Figure: a CNN encodes the image into features that condition an RNN, which emits the first word (“The”) through a linear classifier over the vocabulary]
Image captioning
[Figure: unrolling further — the RNN has now emitted “The man”, with a linear classifier applied at each hidden state h1, h2, h3]
Test-time walkthrough
[Figure: a test image is passed through the CNN to produce a feature vector v]

The first RNN input is the <START> token: x0 = <START>.

before: h = tanh(Wxh * x + Whh * h)
now: h = tanh(Wxh * x + Whh * h + Wih * v)

i.e. at test time the image features v enter the recurrence through the extra term Wih * v.

Sample a word from the output distribution y0 (“straw”) and feed it back in as the next input. Sampling from y1 then gives “hat”, and so on, step by step, until the <END> token is sampled => finish.
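A numpy sketch of this test-time loop, assuming a word-embedding table embed, an output projection Why, and greedy argmax in place of the sampling shown on the slides; all of these names are hypothetical:

```python
import numpy as np

def sample_caption(v, Wxh, Whh, Wih, Why, embed, start_id, end_id, max_len=20):
    # Greedy test-time generation following the slide's recurrence:
    #   h = tanh(Wxh * x + Whh * h + Wih * v), v = CNN image features
    h = np.zeros(Whh.shape[0])
    x = embed[start_id]                 # x0 = <START> embedding
    caption = []
    for _ in range(max_len):
        h = np.tanh(Wxh @ x + Whh @ h + Wih @ v)
        scores = Why @ h                # scores over the vocabulary
        word = int(scores.argmax())     # greedy stand-in for "sample!"
        if word == end_id:              # <END> token => finish
            break
        caption.append(word)
        x = embed[word]                 # feed the chosen word back in
    return caption
```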
Dataset
Microsoft COCO [Tsung-Yi Lin et al. 2014], mscoco.org
currently: ~120K images, ~5 sentences each
Prediction
LSTM
Q&A