Interaction Lab. Seoul National University of Science and Technology
핵심 딥러닝 입문 (Introduction to Core Deep Learning)
Chapter 4. RNN
Jeong Jae-Yeop
Agenda
■Intro
■Training method
■Code practice
■Conclusion
Intro
■What is RNN?
 Recurrent Neural Network
• Sequence data
• 𝑡 : Time
[Figure: RNN structure with Input, Hidden, and Output layers; the Hidden layer is recurrent]
■Recurrent architecture
■Activation function
 Hyperbolic tangent
• 𝑥𝑡 : Input
• 𝑊𝑥 : Input weight
• 𝑏 : Bias
• ℎ𝑡−1 : Previous output
• 𝑊ℎ : Previous output weight
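Putting these symbols together, the hidden output at time 𝑡 is ℎ𝑡 = tanh(𝑥𝑡𝑊𝑥 + ℎ𝑡−1𝑊ℎ + 𝑏). A minimal NumPy sketch of one such step (the sizes and variable names are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: mix the current input with the previous output, then apply tanh."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Illustrative sizes (assumptions): batch N=3, input size D=4, hidden size H=5
N, D, H = 3, 4, 5
rng = np.random.default_rng(0)
x_t    = rng.standard_normal((N, D))   # input at time t
h_prev = np.zeros((N, H))              # previous output h_{t-1}
W_x    = rng.standard_normal((D, H))   # input weight
W_h    = rng.standard_normal((H, H))   # previous-output weight
b      = np.zeros(H)                   # bias
h_t = rnn_step(x_t, h_prev, W_x, W_h, b)   # shape (N, H)
```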
Training method
■Feed forward propagation
 Compute and store the intermediate variables in order, from the input layer to the output layer of the NN
■Backpropagation
 Compute the gradients of the loss with respect to the NN's parameters, working backward from the output layer to the input layer
■Feed forward propagation of RNN
 Deep Neural Network
• 𝑈 = 𝑋𝑊 + 𝐵
• 𝑌 = 𝑓(𝑈)
 RNN
• 𝑈(𝑡) = 𝑋(𝑡)𝑊 + 𝑌(𝑡−1)𝑉 + 𝐵
• 𝑌(𝑡) = 𝑓(𝑈(𝑡))
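A minimal NumPy sketch of the contrast (the function names and sizes are illustrative assumptions; the variable names follow the slide's notation): the dense layer is a single matrix product, while the RNN layer loops over time and feeds its previous output 𝑌(𝑡−1) back in through the extra weight 𝑉.

```python
import numpy as np

def dense_forward(X, W, B, f=np.tanh):
    """Deep neural network layer: U = XW + B, Y = f(U)."""
    return f(X @ W + B)

def rnn_forward(Xs, W, V, B, f=np.tanh):
    """RNN layer: U(t) = X(t)W + Y(t-1)V + B, Y(t) = f(U(t)), looped over time."""
    T, N, D = Xs.shape
    H = W.shape[1]
    Y = np.zeros((N, H))            # Y(0): initial "previous output"
    Ys = []
    for t in range(T):
        U = Xs[t] @ W + Y @ V + B   # current input plus fed-back previous output
        Y = f(U)
        Ys.append(Y)
    return np.stack(Ys)             # shape (T, N, H)

# Illustrative sizes (assumptions): T=6 time steps, batch N=2, input D=3, hidden H=4
rng = np.random.default_rng(0)
Xs = rng.standard_normal((6, 2, 3))
W, V, B = rng.standard_normal((3, 4)), rng.standard_normal((4, 4)), np.zeros(4)
Y_dense = dense_forward(Xs[0], W, B)   # one dense layer on a single time step
Ys = rnn_forward(Xs, W, V, B)          # recurrent layer over the whole sequence
```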
■Feed forward propagation of RNN
[Figure: forward-propagation diagram. Input(t) is multiplied by its weight matrix, the previous output is multiplied by its weight matrix, the two products are summed with the bias, and the activation function produces the output, which is passed both to the next layer and to the next time point]
■Feed forward propagation of RNN
𝑈(𝑡) = 𝑥𝑡𝑊𝑥ℎ + ℎ𝑡−1𝑊ℎℎ + 𝑏ℎ
■Backpropagation of RNN
 We have to update the parameters 𝑊𝑥ℎ, 𝑊ℎℎ, and 𝑏ℎ
[Figure: backpropagation through the RNN cell; the gradient 𝑑ℎ𝑡−1 is passed back to the previous time step]
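A minimal NumPy sketch of this backward pass for the tanh layer above (names, shapes, and the toy data are assumptions, not the lecture's code): the gradient arriving at each step is pushed back through tanh, accumulated into the parameter gradients for 𝑊𝑥ℎ, 𝑊ℎℎ, 𝑏ℎ, and handed to the previous step as 𝑑ℎ𝑡−1.

```python
import numpy as np

def rnn_backward(xs, hs, h0, dhs, Wxh, Whh):
    """BPTT for h_t = tanh(x_t Wxh + h_{t-1} Whh + bh).

    xs:  (T, N, D) inputs, hs: (T, N, H) stored hidden states,
    dhs: (T, N, H) gradients arriving at each hidden state from the layer above.
    """
    dWxh, dWhh, dbh = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros(Whh.shape[0])
    dh_next = np.zeros_like(hs[0])                    # gradient flowing back from step t+1
    for t in reversed(range(len(xs))):                # walk backwards through time
        h_prev = hs[t - 1] if t > 0 else h0
        du = (dhs[t] + dh_next) * (1.0 - hs[t] ** 2)  # back through tanh
        dWxh += xs[t].T @ du                          # accumulate parameter gradients
        dWhh += h_prev.T @ du
        dbh  += du.sum(axis=0)
        dh_next = du @ Whh.T                          # this is dh_{t-1} in the figure
    return dWxh, dWhh, dbh

# Toy forward pass to produce the stored hidden states (sizes are assumptions)
T, N, D, H = 5, 2, 3, 4
rng = np.random.default_rng(0)
xs, h0 = rng.standard_normal((T, N, D)), np.zeros((N, H))
Wxh, Whh, bh = rng.standard_normal((D, H)), rng.standard_normal((H, H)), np.zeros(H)
hs, h = [], h0
for t in range(T):
    h = np.tanh(xs[t] @ Wxh + h @ Whh + bh)
    hs.append(h)
hs = np.stack(hs)
dWxh, dWhh, dbh = rnn_backward(xs, hs, h0, np.ones((T, N, H)), Wxh, Whh)
```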
■BPTT (Backpropagation Through Time)
 As the time span of the time-series data grows, the computing resources consumed by BPTT also grow
 As the time span grows, the backpropagated gradient becomes unstable (it vanishes or explodes)
■Truncated BPTT
 The backpropagation connections are cut into segments of an appropriate length, while the forward connections are kept intact
 Because the hidden state is still carried forward across segments, the data must be fed in order (a minimal sketch follows below)
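A minimal truncated-BPTT training loop, sketched here in PyTorch as an assumption (the lecture's own practice code is not shown): the sequence is consumed in order in fixed-length segments, and detach() cuts the backpropagation connection at each segment boundary while the forward hidden state is carried through.

```python
import torch
import torch.nn as nn

seq_len, trunc_len, batch, in_dim, hid_dim = 1000, 50, 2, 8, 16   # illustrative sizes
rnn = nn.RNN(in_dim, hid_dim, nonlinearity='tanh', batch_first=True)
head = nn.Linear(hid_dim, 1)
optim = torch.optim.SGD(list(rnn.parameters()) + list(head.parameters()), lr=0.01)

x = torch.randn(batch, seq_len, in_dim)       # toy input sequence
y = torch.randn(batch, seq_len, 1)            # toy targets
h = torch.zeros(1, batch, hid_dim)            # initial hidden state

for start in range(0, seq_len, trunc_len):    # feed the data in order
    xs = x[:, start:start + trunc_len]
    ys = y[:, start:start + trunc_len]
    h = h.detach()                            # cut the backpropagation connection here
    out, h = rnn(xs, h)
    loss = nn.functional.mse_loss(head(out), ys)
    optim.zero_grad()
    loss.backward()                           # gradients flow only within this segment
    optim.step()
```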
■Truncated BPTT using mini-batch
 Mini-batch size : 2
 1,000 samples are split in half: one batch row reads samples 0–499 in order, the other reads samples 500–999 (see the sketch below)
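A small NumPy sketch of this assumed batch layout: the 1,000 samples are reshaped into two rows of 500, and each training step takes the next truncation-length slice from both rows.

```python
import numpy as np

data = np.arange(1000)                # the full time series (toy values)
batch_size, trunc_len = 2, 10         # mini-batch of 2, illustrative truncation length
rows = data.reshape(batch_size, -1)   # row 0: samples 0..499, row 1: samples 500..999

# Each training step consumes the next trunc_len columns from every row, in order.
for start in range(0, rows.shape[1], trunc_len):
    minibatch = rows[:, start:start + trunc_len]   # shape (2, 10)
    # ... forward pass + truncated backpropagation on `minibatch` ...
```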
Code practice
■Binary addition
 5 = 1 × 22 + 0 × 21 + 1 × 20 ∶ 101
 36 = 1 × 25 + 0 × 24 + 0 × 23 + 0 × 22 +0 × 21 +0 × 20 ∶ 100100
 Input : two randomly selected binary numbers
 Label : the sum of the two numbers, also in binary (a data-generation sketch follows below)
 Link
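A minimal NumPy sketch of how one training pair for this task could be generated (the bit width, LSB-first ordering, and names are assumptions; the linked practice code may differ). The digits are fed least-significant bit first, so the carry propagates in the direction the RNN reads the sequence.

```python
import numpy as np

n_bits = 8                                   # assumed fixed bit width
rng = np.random.default_rng(0)

def to_bits(n, n_bits=n_bits):
    """Return the binary digits of n, least-significant bit first."""
    return np.array([(n >> i) & 1 for i in range(n_bits)], dtype=np.float32)

a = int(rng.integers(0, 2 ** (n_bits - 1)))  # keep the sum within n_bits
b = int(rng.integers(0, 2 ** (n_bits - 1)))
x = np.stack([to_bits(a), to_bits(b)], axis=1)   # input sequence, shape (n_bits, 2)
y = to_bits(a + b)                               # label sequence, shape (n_bits,)
```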
Conclusion
■Disadvantages of RNN
 Vanishing and exploding gradients
• Vanishing gradients are addressed by gated variants such as LSTM and GRU
• Exploding gradients are addressed by gradient clipping (a sketch follows below)
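A minimal sketch of gradient clipping by global norm (the threshold and names are assumptions), the standard remedy for exploding gradients mentioned above:

```python
import numpy as np

def clip_grads(grads, max_norm=5.0):
    """Rescale a list of gradient arrays so their combined L2 norm stays under max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    rate = max_norm / (total_norm + 1e-6)
    if rate < 1.0:                     # only rescale when the norm exceeds the threshold
        grads = [g * rate for g in grads]
    return grads
```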
Q&A
