
Artificial Intelligence: Long Short Term Memory Networks

This document discusses Long Short Term Memory (LSTM) networks. It describes how LSTMs address issues with traditional recurrent neural networks like vanishing gradients. LSTMs use gates to control the flow of information through a cell state. The document provides detailed explanations of the forget gate, input gate, cell state, output gate, and how LSTMs are applied to character-level language modeling. Diagrams and equations are included to illustrate the LSTM architecture and training process.

Artificial Intelligence

Long Short Term


Memory Networks
Pham Viet Cuong
Dept. Control Engineering & Automation, FEEE
Ho Chi Minh City University of Technology
Recurrent Neural Network

h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h)

y_t = softmax(W_hy h_t + b_y)
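These two update equations can be sketched directly in NumPy. This is a minimal illustration, not code from the slides; the layer sizes (4-dimensional input, 3-dimensional hidden state) are chosen to match the character-level example later in the deck.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a vector
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_step(x_t, h_prev, Whh, Wxh, Why, bh, by):
    # h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h)
    h_t = np.tanh(Whh @ h_prev + Wxh @ x_t + bh)
    # y_t = softmax(W_hy h_t + b_y)
    y_t = softmax(Why @ h_t + by)
    return h_t, y_t

# Illustrative sizes: input 4, hidden 3 (assumed, not from the slides)
rng = np.random.default_rng(0)
Whh, Wxh = rng.normal(size=(3, 3)), rng.normal(size=(3, 4))
Why = rng.normal(size=(4, 3))
bh, by = np.zeros(3), np.zeros(4)
h, y = rnn_step(np.array([1., 0., 0., 0.]), np.zeros(3), Whh, Wxh, Why, bh, by)
```

The same `rnn_step` is applied at every time step, carrying `h` forward, which is what makes the network recurrent.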

Recurrent Neural Network

Long Short Term Memory
✓ RNN limitations:
❖ Vanishing gradient
❖ Exploding gradient
❖ Difficulty capturing long-term dependencies
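The vanishing-gradient limitation can be demonstrated numerically: backpropagation through time multiplies one Jacobian dh_t/dh_{t-1} = diag(1 - h_t^2) W_hh per step, and since the tanh derivative is at most 1, the product tends to shrink toward zero over long sequences. A small sketch (sizes and weight scale are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
Whh = rng.normal(size=(3, 3)) * 0.2   # small recurrent weights
x = rng.normal(size=(3, 50))          # 50 random input steps

h = np.zeros(3)
grad = np.eye(3)                      # accumulates d h_T / d h_0
norms = []
for t in range(50):
    h = np.tanh(Whh @ h + x[:, t])
    # Chain rule: multiply in this step's Jacobian dh_t/dh_{t-1}
    grad = np.diag(1 - h**2) @ Whh @ grad
    norms.append(np.linalg.norm(grad))
```

After 50 steps `norms[-1]` is many orders of magnitude smaller than `norms[0]`, so early inputs barely influence the loss gradient. (With large weights the same product explodes instead.)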

LSTM

Long Short Term Memory

Long Short Term Memory
✓ Cell state

Long Short Term Memory
✓ Gate
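In the standard LSTM formulation, a gate is a sigmoid layer followed by pointwise multiplication: the sigmoid outputs values in (0, 1), each scaling how much of the corresponding component is let through.

```latex
g = \sigma(Wz + b), \qquad \text{output} = g \odot \text{signal}, \qquad \sigma(a) = \frac{1}{1 + e^{-a}}
```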

Long Short Term Memory
✓ Forget gate
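In the standard formulation, the forget gate reads the previous hidden state and the current input and outputs, for each cell-state component, a value in (0, 1): 1 means keep that component entirely, 0 means discard it.

```latex
f_t = \sigma\left(W_f\,[h_{t-1},\, x_t] + b_f\right)
```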

Long Short Term Memory
✓ Input gate
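The input gate decides which cell-state components to update, while a tanh layer proposes candidate values for them.

```latex
i_t = \sigma\left(W_i\,[h_{t-1},\, x_t] + b_i\right), \qquad \tilde{C}_t = \tanh\left(W_C\,[h_{t-1},\, x_t] + b_C\right)
```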

Long Short Term Memory
✓ Cell state update
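The new cell state combines what the forget gate keeps of the old state with what the input gate admits of the candidate values (⊙ denotes pointwise multiplication).

```latex
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
```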

Long Short Term Memory
✓ Output gate
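The output gate selects which parts of the tanh-squashed cell state become the new hidden state.

```latex
o_t = \sigma\left(W_o\,[h_{t-1},\, x_t] + b_o\right), \qquad h_t = o_t \odot \tanh(C_t)
```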

Long Short Term Memory

y_t = softmax(W_hy h_t + b_y)
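Combining the forget, input, and output gates with the cell-state update and this softmax readout gives one full forward step. The sketch below uses the standard LSTM equations; the weight shapes follow the character-level example (input 4, hidden 3) and the random initialization is illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def lstm_step(x_t, h_prev, C_prev, W, b, Why, by):
    """One LSTM forward step; W and b hold the four gate parameter sets."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W['f'] @ z + b['f'])       # forget gate
    i_t = sigmoid(W['i'] @ z + b['i'])       # input gate
    C_tilde = np.tanh(W['C'] @ z + b['C'])   # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde       # cell state update
    o_t = sigmoid(W['o'] @ z + b['o'])       # output gate
    h_t = o_t * np.tanh(C_t)                 # new hidden state
    y_t = softmax(Why @ h_t + by)            # y_t = softmax(W_hy h_t + b_y)
    return h_t, C_t, y_t

# Illustrative sizes: input 4, hidden 3, so [h, x] has 7 components
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(3, 7)) for k in 'fiCo'}
b = {k: np.zeros(3) for k in 'fiCo'}
Why, by = rng.normal(size=(4, 3)), np.zeros(4)
h, C, y = lstm_step(np.array([1., 0., 0., 0.]), np.zeros(3), np.zeros(3), W, b, Why, by)
```

Note that the cell state `C_t` is updated only by pointwise scaling and addition, which is what lets gradients flow through many steps without vanishing.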
Long Short Term Memory
✓ Example: Language model - Character level
❖ Training sequence: “hello”
❖ Vocabulary: h, e, l, o

Long Short Term Memory
✓ Example: Language model - Character level
❖ Training sequence: “hello”
❖ Vocabulary: h, e, l, o

▪ x_t: 4x1, h_t: 3x1, C_t: 3x1
▪ W_f, W_i, W_C, W_o: 3x7
▪ b_f, b_i, b_C, b_o: 3x1
▪ W_hy: 4x3
▪ b_y: 4x1
▪ y_t = softmax(W_hy h_t + b_y)
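These dimensions can be checked directly: with the 4-character vocabulary h, e, l, o from the slide, each one-hot x_t is 4x1, the hidden state is 3x1, so the concatenation [h_{t-1}, x_t] has 3 + 4 = 7 components and each gate weight matrix must be 3x7. A small sketch (the `one_hot` helper is hypothetical, for illustration):

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']          # vocabulary from the slide
n_in, n_h = len(vocab), 3             # x_t: 4x1; h_t, C_t: 3x1

def one_hot(ch):
    # Encode a character as a one-hot column of length 4
    v = np.zeros(n_in)
    v[vocab.index(ch)] = 1.0
    return v

# Gate weights act on [h_{t-1}, x_t], which has 3 + 4 = 7 components
Wf = np.zeros((n_h, n_h + n_in))      # shape (3, 7), same for Wi, WC, Wo
Why = np.zeros((n_in, n_h))           # readout back to vocabulary: (4, 3)
x = one_hot('h')                      # first input of "hello"
```

Training then feeds "hell" in one character at a time and asks the softmax output to predict "ello", the next character at each step.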