Convolutional Neural Networks(CNN) / Stanford cs231n 2017 lecture 5 / MLAI@UOS Lab Meeting

Convolutional Neural Networks
cs231n 2017s, lecture5 review (based on video & notes)
University of Seoul, Department of Artificial Intelligence
Machine Learning and Artificial Intelligence (MLAI) Lab
Changdae Oh, Department of Statistics
2021 / 07 / 09
MLAI Lab Meeting, 2021f
LINK
LINK
Presenter:
MLAI:

Plan for today
1. Background
2. Overview
3. CNN components
(conv / pooling / fc)
4. Other topics
(vis / transfer learning / other domains)
5. Discussion
*Focused on image classification

1. Background
Hubel & Wiesel (1959 ~ )
• Studied the visual cortex of cats
• Recorded neural responses to stimuli such as edges and blobs
• Topographical mapping within the cortex
• Hierarchical organization of neurons

Neocognitron (Fukushima, 1980)
• Simple cell – learnable
• Complex cell – fixed
• S and C cells are alternated repeatedly -> builds up a hierarchy
• Fully unsupervised
• Translation invariance
Original paper: https://www.rctn.org/bruno/public/papers/Fukushima1980.pdf

Supervising the neocognitron
• Add a class decision layer after the last C layer
• End-to-end learning via backpropagation
The first CNN

Huge adoption in computer vision applications
• Classification
• Image Retrieval
• Detection
• Segmentation
• Visual Tracking
• Pose Estimation
• Video Recognition
• Reinforcement Learning
• Image Captioning
• Deep Dream
• Style Transfer
• Image Generation

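2. Overview
Regular Neural Networks for image classification
• An image in 3D tensor form (W, H, C) is flattened into a 1D vector before being fed into the network.
• Each neuron in a layer is connected to every pixel of the input image (fully connected).
• The same pattern appearing at different locations is learned with separate weights.
Spatial information is lost.
The number of required parameters grows rapidly as the input image gets larger.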
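To make the parameter growth concrete, here is a minimal sketch that counts the weights and biases of one fully-connected layer on a flattened image. The hidden width of 100 units and the 3x3 conv with 100 filters used for contrast are illustrative assumptions, not values from the lecture.

```python
def fc_params(width, height, channels, hidden_units):
    """Weights + biases of one fully-connected layer on a flattened image."""
    inputs = width * height * channels  # length of the flattened vector
    return inputs * hidden_units + hidden_units


def conv_params(kernel_size, in_channels, out_channels):
    """Weights + biases of one conv layer; independent of the image size."""
    weights_per_filter = kernel_size * kernel_size * in_channels
    return weights_per_filter * out_channels + out_channels


if __name__ == "__main__":
    for side in (32, 224):
        # hidden_units=100 and a 3x3 conv with 100 filters are assumed values
        print(side, fc_params(side, side, 3, 100), conv_params(3, 3, 100))
```

Going from a 32x32 to a 224x224 RGB input multiplies the FC count by about 49, while the conv count does not change at all.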

• Preserves spatial structure!
: each convolutional layer is a 3D-to-3D transformation function
(4D to 4D when the batch dimension is included)

High-level abstract features are learned by composing low-level concrete features.
(hierarchical representation learning)
Image from https://www.deeplearningbook.org/contents/intro.html

• Repeated [Conv + ReLU] blocks with POOL: feature extraction
• Toward the output, the spatial dimensions shrink while the channel dimension deepens
• FC layers map the extracted features to class scores

3. CNN components
3.1. Convolutional layer
• Slide a small filter over the entire image; at each position, take the element-wise product and sum all the results.
Note that the connectivity is local in 2D space (but full along the input depth).
• Each filter's weights are shared across the whole input feature map (parameter sharing)
- Along the spatial dimensions, locally correlated features are learned
- Along the depth dimension, independent features are learned
Terms to check!!
▪ Stride
▪ Padding
▪ Feature Map
▪ Activation Map
▪ Filter
▪ Kernel
▪ Channel
▪ Receptive field
Image from https://developer.nvidia.com/discover/convolution
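The sliding-window operation can be sketched in plain Python. This is a minimal illustration with no padding and a single filter, not an efficient implementation; the function name and the nested-list layout [C][H][W] are assumptions made for the example. Like most deep learning libraries, it computes cross-correlation (the kernel is not flipped).

```python
def conv2d_single_filter(image, kernel, stride=1):
    """Apply one filter to a multi-channel image.

    image:  nested lists shaped [C][H][W]
    kernel: nested lists shaped [C][k][k] (same C: full along the depth)
    Returns one activation map shaped [H_out][W_out].
    """
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    k = len(kernel[0])
    out_h = (height - k) // stride + 1
    out_w = (width - k) // stride + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Local in space (k x k window), full along the input depth.
            for c in range(channels):
                for di in range(k):
                    for dj in range(k):
                        pixel = image[c][i * stride + di][j * stride + dj]
                        total += pixel * kernel[c][di][dj]
            row.append(total)
        output.append(row)
    return output


if __name__ == "__main__":
    image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]  # one channel, 3x3
    kernel = [[[1, 0], [0, -1]]]                 # one channel, 2x2
    print(conv2d_single_filter(image, kernel))   # [[-4.0, -4.0], [-4.0, -4.0]]
```

The same kernel values are reused at every window position, which is the parameter sharing mentioned above. A layer with several filters would run this once per filter and stack the resulting activation maps along the channel axis.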

Hyperparameters
1. Output channels
• We want each filter to learn to detect a different pattern.
• We want each channel's feature map to activate on a different feature.
2. Kernel size
• Determines the receptive field of a neuron
• In general, stacking several small filters in sequence works better than using one large filter (in terms of memory use and performance).
Image from https://www.slideshare.net/modulabs/2-cnn-rnn
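The small-versus-large filter trade-off can be checked with simple arithmetic. The sketch below assumes stride 1, the same number of channels C in and out of every layer, and no biases; the value C = 64 is an illustrative assumption.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stacked k x k convs with stride 1."""
    return 1 + num_layers * (kernel_size - 1)


def stacked_weights(kernel_size, channels, num_layers):
    """Weight count (no biases) when every layer maps C channels to C channels."""
    return num_layers * kernel_size * kernel_size * channels * channels


if __name__ == "__main__":
    C = 64  # assumed channel count
    print(stacked_receptive_field(3, 3), stacked_weights(3, C, 3))  # 7 110592
    print(stacked_receptive_field(7, 1), stacked_weights(7, C, 1))  # 7 200704
```

Three stacked 3x3 layers see the same 7x7 window as a single 7x7 layer with roughly half the weights, and they also apply three non-linearities instead of one.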

3. Stride
• The step size with which the filter slides
4. Padding
• Preserves the spatial size
• Gives pixels at the corners and edges fair coverage
Image from https://deepai.org/machine-learning-glossary-and-terms/padding

Computing the output volume & number of parameters
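The usual rule is: for input size W, filter size F, padding P and stride S, the output size along each spatial axis is (W - F + 2P) / S + 1, and a layer with K filters over C input channels has (F * F * C + 1) * K parameters (the +1 is the bias). A minimal sketch follows; the function names are hypothetical, and rejecting settings where the filter does not fit evenly is a design choice of this example.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    span = input_size - kernel_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not fit the input evenly with these settings")
    return span // stride + 1


def conv_layer_params(kernel_size, in_channels, out_channels):
    """(F * F * C + 1) * K: weights plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


if __name__ == "__main__":
    # 32x32x3 input, ten 5x5 filters, stride 1, padding 2
    side = conv_output_size(32, 5, stride=1, padding=2)
    print((side, side, 10), conv_layer_params(5, 3, 10))  # (32, 32, 10) 760
```

For a 7x7 input and a 3x3 filter, stride 1 gives 5 and stride 2 gives 3, while stride 3 does not fit and raises the error.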

Various kinds of convolution
https://github.com/vdumoulin/conv_arithmetic
1 by 1 conv
• Depth-wise dot product
• Adjusts the number of channels
• Adds non-linearity when followed by an activation function
Dilated conv
• Widens the effective receptive field
• Merge spatial information more aggressively with fewer layers
Transposed conv
• Used when upsampling is needed (segmentation, image generation)
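The three variants can each be summarized by a small piece of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 1x1 conv is shown at a single spatial position, and the transposed-conv size formula assumes no dilation and no extra output padding.

```python
def pointwise_conv(pixel_channels, filters):
    """1x1 conv at one spatial position: one dot product over channels per filter.

    pixel_channels: list of C values; filters: list of K weight lists of length C.
    Returns K values, so the channel count changes from C to K.
    """
    return [sum(w * x for w, x in zip(f, pixel_channels)) for f in filters]


def dilated_kernel_extent(kernel_size, dilation):
    """Width of the input window covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def transposed_conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size of a transposed conv: (I - 1) * S - 2P + F."""
    return (input_size - 1) * stride - 2 * padding + kernel_size


if __name__ == "__main__":
    print(pointwise_conv([1.0, 2.0, 3.0], [[1, 0, 0], [0.5, 0.5, 0.5]]))  # [1.0, 3.0]
    print(dilated_kernel_extent(3, 2))                                    # 5
    print(transposed_conv_output_size(4, 4, stride=2, padding=1))         # 8
```

A 3x3 kernel with dilation 2 covers a 5x5 window with the same nine weights, and a 4x4 transposed conv with stride 2 and padding 1 doubles the spatial size.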

3.2. (local) Pooling
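• Reduces the spatial dimensions (lower resolution)
• Reduces the number of parameters in later layers
Lowers the required computation and makes the network robust to small changes in the input
• Max pooling is generally known to work best, but many other pooling methods exist
• Avg, p-norm, parametric, ...
• Global pooling? :: link
Can be replaced by a conv with a large stride.
Both perform downsampling, but through entirely different processes.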
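As a minimal sketch of local pooling, here is max pooling over a single feature map in plain Python. The 2x2 window with stride 2 is the common default and an assumption of this example.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling over one [H][W] feature map; no learnable parameters."""
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


if __name__ == "__main__":
    fmap = [
        [1, 1, 2, 4],
        [5, 6, 7, 8],
        [3, 2, 1, 0],
        [1, 2, 3, 4],
    ]
    print(max_pool2d(fmap))  # [[6, 8], [3, 4]]
```

Pooling is applied to each channel independently, so the depth is unchanged while height and width are halved. A strided conv also halves the resolution, but it does so with learned weights rather than a fixed max.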
Image from http://taewan.kim/post/cnn/

3.3. Fully Connected Layer
• Takes the features refined by repeated Conv+ReLU+Pooling blocks as input and maps them to class scores
• Can be replaced by global pooling such as global average/max pooling (GAP/GMP), or by a conv layer.
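Global average pooling can be sketched as follows: each channel's feature map collapses to a single number, so a [C][H][W] volume becomes a length-C vector regardless of H and W. The function name and the nested-list layout are assumptions made for the example.

```python
def global_average_pool(feature_maps):
    """Average every [H][W] map in a [C][H][W] volume down to one value."""
    pooled = []
    for fmap in feature_maps:
        count = len(fmap) * len(fmap[0])
        pooled.append(sum(sum(row) for row in fmap) / count)
    return pooled


if __name__ == "__main__":
    volume = [
        [[1, 2], [3, 4]],  # channel 0
        [[0, 0], [0, 8]],  # channel 1
    ]
    print(global_average_pool(volume))  # [2.5, 2.0]
```

If the last conv layer is set up to emit one channel per class, the pooled vector can be used directly as class scores, which removes the FC layer's weights entirely.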

Summing up
• Stack CONV, ReLU (activation), and POOL blocks
• Trend toward smaller filters and deeper stacks of layers
• Trend toward removing POOL / FC layers and using only CONV (with activation)
• The traditional architecture looks like this:
[(CONV+ReLU)*N + POOL]*M + (FC+ReLU)*K + Softmax
Conv layers typically use padding to keep the spatial shape, so downsampling happens only through pooling
• Going beyond simply chaining blocks sequentially, GoogLeNet (network in network) and ResNet (skip connections) brought major improvements
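The shape bookkeeping of the traditional pattern can be traced with a small helper. This is a sketch under stated assumptions: every conv is padded so height and width stay the same, every pool is 2x2 with stride 2, and the channel counts passed in are arbitrary illustrative values.

```python
def trace_shapes(input_shape, channels_per_block, convs_per_block=2):
    """Trace (H, W, C) through [(CONV+ReLU)*N + POOL]*M.

    input_shape:        (H, W, C) of the input image
    channels_per_block: output channels of the convs in each of the M blocks
    convs_per_block:    N, the number of padded convs before each pool
    """
    h, w, c = input_shape
    shapes = [(h, w, c)]
    for out_channels in channels_per_block:
        for _ in range(convs_per_block):
            c = out_channels        # padded conv: H and W unchanged
            shapes.append((h, w, c))
        h, w = h // 2, w // 2       # 2x2 pool, stride 2: H and W halved
        shapes.append((h, w, c))
    return shapes


if __name__ == "__main__":
    for shape in trace_shapes((32, 32, 3), [16, 32], convs_per_block=1):
        print(shape)
    # (32, 32, 3) -> (32, 32, 16) -> (16, 16, 16) -> (16, 16, 32) -> (8, 8, 32)
```

The final (8, 8, 32) volume would then be flattened into 2048 values and passed to the FC layers.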

4. Other topics
Visualizing
Filters
Activation maps
Embedding the codes
Notice that the similarities are more often class-based and
semantic rather than pixel and color-based.

Transfer Learning
• Build my own cool model and train it from scratch!!?
• We are always hungry for data
• The larger the model, the longer the training time it requires
Image from here
• Just take a well-trained SOTA model and use it!
• The transfer learning procedure differs depending on how similar the task or dataset is.

CNN for text / signal
Using 1D Convolution !!
Image from https://www.sciencedirect.com/science/article/pii/S0888327020307846
https://arxiv.org/abs/1408.5882
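A 1D convolution is the same sliding dot product along a single axis (time steps of a signal, or token positions in a sentence). A minimal sketch for a single-channel sequence, with a hypothetical function name and no padding:

```python
def conv1d(signal, kernel, stride=1):
    """Slide a 1D kernel over a sequence and take a dot product at each step."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


if __name__ == "__main__":
    # A [1, 0, -1] kernel responds to the local slope of the signal.
    print(conv1d([1, 2, 3, 4, 5], [1, 0, -1]))  # [-2, -2, -2]
```

For text, each position would hold an embedding vector instead of a scalar, and the kernel would span a few consecutive tokens across the full embedding depth.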

Appendix
Behavior of the intermediate layers
Image from https://deeplearning.cs.cmu.edu/S21/document/slides/lec19.representations.pdf
• For visualization, each layer's output space is reduced to 2D with PCA
• As training progresses, the output features of the upper layers become linearly separable !!
(learn disentangled representation)

Changdae Oh
bnormal16@naver.com
https://velog.io/@changdaeoh
https://github.com/changdaeoh
