ICEBREAKER
Phạm Nguyễn Anh Thư
@thu.phamhcmut
Let’s get to know each other!
Record shop
_____ ___ ____
Thật bất ngờ
___ ____ ____
Em dạo này
Tình đắng như ly cà phê
____ _____ ____ _ __ ____
___ ___ ___ ___ ___ ______
Để Mị nói cho mà nghe
____ ____ ____ ___ __ ____ ____
Đâu cần một bài ca tình yêu
how-to-AI Series: Unlock Potential
Day 3: Take UP Convolutional Neural Network
Nguyễn Luật Gia Khôi
@giakhoi.nguyenluat
Nguyễn Thế Bình
@binh.nguyen288
Outline
1. Break the ice
2. Convolutional Neural Network
3. Why use a CNN instead of an NN?
4. Demo code
5. Kaggle Challenge
Convolutional Neural Network
Matrix representation (a 7 x 5 image of the digit 9)
1 -1 -1 -1 1
1 -1 1 -1 1
1 -1 -1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 -1 -1 1 1
● -1: black
● 1: white
An ANN can handle this variety in how a digit is drawn:

1 -1 -1 -1 1
1 -1 1 -1 1
1 -1 -1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 -1 -1 1 1

-1 -1 -1 1 1
-1 1 -1 1 1
-1 -1 -1 1 1
1 1 -1 1 1
1 1 -1 1 1
1 -1 1 1 1
-1 1 1 1 1

1 1 -1 1 1
1 -1 1 -1 1
1 -1 -1 1 1
1 1 -1 1 1
1 1 -1 1 1
1 1 -1 1 1
1 1 -1 1 1
1 -1 -1 -1 1
1 -1 1 -1 1
1 -1 -1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 -1 -1 1 1
Flattened input (35 values): 1 -1 -1 -1 1 1 -1 ... 1 -1 1 1 -1 -1 1 1
Input layer: x1 x2 x3 x4 ... x32 x33 x34 x35
Hidden layers: h1 h2 h3 h4 ..., then h1 h2 h3 ... h8 h9 h10
Output layer (digit: probability):
1: 0.01
9: 0.92
8: 0.003
2: 0.008
3: 0.015
0: 0.02
WAIT A SECOND, how about this?
Image size = 1920 x 1080 x 3
Input layer #neurons = 1920 x 1080 x 3 = 6,220,800 (about 6 million)
Hidden layer #neurons = say, 4 million
Weights between input and first hidden layer =
6 million x 4 million = 24 trillion parameters
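The arithmetic above can be checked in a few lines of Python. This is only a sketch: the variable names are illustrative, biases are ignored, and the 4 million hidden neurons are the slide's own round figure.

```python
# Weight count for one fully connected layer on a full-HD RGB image.
width, height, channels = 1920, 1080, 3
input_neurons = width * height * channels   # one input neuron per pixel value
hidden_neurons = 4_000_000                  # the slide's round figure

# Every input neuron connects to every hidden neuron (biases ignored).
weights = input_neurons * hidden_neurons

print(f"input neurons: {input_neurons:,}")  # 6,220,800 (about 6 million)
print(f"weights:       {weights:,}")        # 24,883,200,000,000 (about 24 trillion)
```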
Disadvantages of ANN in image classification
- Computationally expensive.
- Treats neighboring pixels the same as pixels far apart => no notion of locality.
- Sensitive to the location of the object in the image.
How do humans recognize images?
Koala: Head (Eyes, Nose, Ears), Body (Hand, Leg)
Nine: Curves, Long curve, Circular pattern
How can we make computers recognize these patterns?
1 -1 -1 -1 1
1 -1 1 -1 1
1 -1 -1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 1 -1 1 1
1 -1 1 1 1
Filters applied to this image:
Circular pattern filter
Diagonal line filter
Diagonal line filter
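Input image (7 x 5):
1 -1 -1 -1 1
1 -1 1 -1 1
1 -1 -1 -1 1
1 1 1 -1 1
1 1 1 -1 1
1 1 -1 1 1
1 -1 1 1 1
* Circular pattern filter (3 x 3):
-1 -1 -1
-1 1 -1
-1 -1 -1
= Feature map (5 x 3):
-1 9 -1
-5 1 -3
-3 3 -3
-5 -1 -5
-3 -5 -3
Top-middle value, where the filter sits exactly on the loop of the 9:
-1 x -1 + -1 x -1 + … + -1 x -1 + 1 x 1 = 9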
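The sliding-window computation above can be reproduced with a small pure-Python sketch. The function name `conv2d_valid` is illustrative, not from the slides. It assumes stride 1 and no padding, and it does not flip the kernel, which is how deep learning libraries usually define "convolution".

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# The 7 x 5 image of the digit 9 from the slide (-1: black, 1: white).
image = [
    [1, -1, -1, -1, 1],
    [1, -1,  1, -1, 1],
    [1, -1, -1, -1, 1],
    [1,  1,  1, -1, 1],
    [1,  1,  1, -1, 1],
    [1,  1, -1,  1, 1],
    [1, -1,  1,  1, 1],
]
circular_filter = [
    [-1, -1, -1],
    [-1,  1, -1],
    [-1, -1, -1],
]

feature_map = conv2d_valid(image, circular_filter)
for row in feature_map:
    print(row)
# [-1, 9, -1]
# [-5, 1, -3]
# [-3, 3, -3]
# [-5, -1, -5]
# [-3, -5, -3]
```

The largest value (9) appears where the 3 x 3 window matches the filter exactly, which is how a filter "detects" its pattern.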
[Figure: the circular pattern filter (-1 -1 -1 / -1 1 -1 / -1 -1 -1) applied to three more images; each x marks a location where the pattern is detected]
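[Figure: the eye filter applied to three images; x x marks the eyes detected in each]
Image * Eye filter = x x (eyes detected)
Image * Nose filter = x (nose detected)
Image * Ear filter = x x (ears detected)
Eye, nose, and ear feature maps * Head filter = x (head detected)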
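Feature extraction => Flatten => Classification
Feature extraction: filters produce feature maps (x marks a detected feature)
Flatten: the feature maps become one input vector X
Classification: layers of neurons (h1, h2, h3, ...) output class probabilities:
0.01 (Giraffe)
0.80 (Koala)
0.003 (Horse)
0.008 (Buffalo)
0.15 (Bear)
0.02 (Squirrel)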
ReLU
Feature map (7 x 5 image * circular pattern filter):
-1 9 -1
-5 1 -3
-3 3 -3
-5 -1 -5
-3 -5 -3
After ReLU (negative values become 0):
0 9 0
0 1 0
0 3 0
0 0 0
0 0 0
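ReLU simply replaces every negative value with 0. A minimal sketch on the feature map from the slide (the helper name `relu` is illustrative):

```python
def relu(feature_map):
    """Element-wise ReLU: keep positive values, replace negatives with 0."""
    return [[max(0, value) for value in row] for row in feature_map]

feature_map = [
    [-1,  9, -1],
    [-5,  1, -3],
    [-3,  3, -3],
    [-5, -1, -5],
    [-3, -5, -3],
]

activated = relu(feature_map)
for row in activated:
    print(row)
# [0, 9, 0]
# [0, 1, 0]
# [0, 3, 0]
# [0, 0, 0]
# [0, 0, 0]
```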
Stride & Padding
stride = 2, padding = “valid”
stride = 1, padding = “same”
[Figure: input and output grids for each setting]
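How stride and padding change the output size can be sketched as below. The formulas follow the usual Keras/TensorFlow convention for "valid" and "same". The function name and the example sizes are assumptions, since the grid sizes in the slide's figures are not recoverable.

```python
import math

def conv_output_size(input_size, kernel_size, stride, padding):
    """Output length along one dimension for a convolution."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Zero-padding is added so that, with stride 1, output size == input size.
        return math.ceil(input_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")

# Hypothetical 5-wide input with a 3-wide kernel.
print(conv_output_size(5, 3, stride=2, padding="valid"))  # 2
print(conv_output_size(5, 3, stride=1, padding="same"))   # 5

# The 7 x 5 digit image with a 3 x 3 filter, stride 1, "valid" => 5 x 3.
print(conv_output_size(7, 3, 1, "valid"), conv_output_size(5, 3, 1, "valid"))  # 5 3
```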
Max pooling
pool_size = (2,2)
stride = 2
Input:
0 9 0 0
0 1 0 0
0 3 0 0
0 0 0 0
0 0 0 0
Output:
9 0
3 0
output_shape = math.floor((input_shape - pool_size) / strides) + 1
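A minimal pure-Python sketch of max pooling that uses the output_shape formula above. The function name `max_pool2d` is illustrative. With a 5 x 4 input, pool size 2, and stride 2, the last row does not fit a full window and is dropped.

```python
import math

def max_pool2d(feature_map, pool_size=2, stride=2):
    """Take the maximum of each pool_size x pool_size window."""
    out_h = math.floor((len(feature_map) - pool_size) / stride) + 1
    out_w = math.floor((len(feature_map[0]) - pool_size) / stride) + 1
    return [
        [
            max(feature_map[r * stride + i][c * stride + j]
                for i in range(pool_size) for j in range(pool_size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

feature_map = [
    [0, 9, 0, 0],
    [0, 1, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[9, 0], [3, 0]]
```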
Benefits of max pooling
- Reduces feature map size => lower computational cost.
- Fewer parameters needed in later layers => less overfitting.
- Makes the model tolerant of small variations and distortions.
Complete pipeline
Conv + ReLU => w1 x h1 x c1
Pooling => w2 x h2 x c1
Conv + ReLU => w2 x h2 x c2
Pooling => w3 x h3 x c2
Flatten
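The shape bookkeeping of this pipeline can be traced with a short sketch. Everything concrete here is an assumption for illustration: a 28 x 28 x 1 input, 8 and 16 filters, "same" padding with stride 1 for the convolutions (so width and height change only at pooling, as in the slide), and 2 x 2 pooling with stride 2.

```python
def conv_same(shape, filters):
    """Conv + ReLU with stride 1 and "same" padding: size kept, channels = filters."""
    w, h, _ = shape
    return (w, h, filters)

def max_pool(shape, pool_size=2, stride=2):
    """Pooling shrinks width and height and keeps the channel count."""
    w, h, c = shape
    return ((w - pool_size) // stride + 1, (h - pool_size) // stride + 1, c)

def flatten(shape):
    """Flatten turns a w x h x c volume into one vector of w * h * c values."""
    w, h, c = shape
    return w * h * c

shape = (28, 28, 1)                   # hypothetical input: w x h x channels
conv1 = conv_same(shape, filters=8)   # w1 x h1 x c1
pool1 = max_pool(conv1)               # w2 x h2 x c1
conv2 = conv_same(pool1, filters=16)  # w2 x h2 x c2
pool2 = max_pool(conv2)               # w3 x h3 x c2
vector_length = flatten(pool2)

print(conv1, pool1, conv2, pool2, vector_length)
# (28, 28, 8) (14, 14, 8) (14, 14, 16) (7, 7, 16) 784
```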
Why use a CNN instead of an NN?
- Captures spatial features => sparse connections => less overfitting
Image * Eye filter = x x (eyes detected)
Image * Nose filter = x (nose detected)
Image * Ear filter = x x (ears detected)
- Translation invariant.
Conv output:
0 0 9 0
0 1 0 0
0 3 0 0
0 0 0 0
0 0 0 0
Pooling output:
1 9
3 0
Conv output (pattern shifted):
0 0 0 0
0 0 9 0
0 3 0 0
0 0 0 0
0 0 0 0
Pooling output:
0 9
3 0
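The two conv outputs above can be pooled in code to see why a small shift barely changes the result. This is a sketch with an illustrative `max_pool2d` helper (2 x 2 window, stride 2). Note that pooling gives tolerance only to small shifts, not full invariance.

```python
def max_pool2d(feature_map, pool_size=2, stride=2):
    """Take the maximum of each pool_size x pool_size window."""
    out_h = (len(feature_map) - pool_size) // stride + 1
    out_w = (len(feature_map[0]) - pool_size) // stride + 1
    return [
        [
            max(feature_map[r * stride + i][c * stride + j]
                for i in range(pool_size) for j in range(pool_size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

first = [
    [0, 0, 9, 0],
    [0, 1, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
second = [               # the strong response (9) moved down by one row
    [0, 0, 0, 0],
    [0, 0, 9, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pooled_first = max_pool2d(first)
pooled_second = max_pool2d(second)
print(pooled_first)   # [[1, 9], [3, 0]]
print(pooled_second)  # [[0, 9], [3, 0]]
```

The strong responses (9 and 3) land in the same pooled cells in both cases, so later layers see nearly the same input.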
- Location invariant feature detection.
- Parameter sharing.
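Parameter sharing means one small filter is reused at every image position, so a convolutional layer's parameter count does not grow with image size. A rough comparison, assuming a hypothetical layer of 32 filters of size 3 x 3 on an RGB input and the dense layer sizes from the full-HD example earlier in the deck (helper names are illustrative):

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units

conv = conv_params(3, 3, in_channels=3, filters=32)    # independent of image size
dense = dense_params(1920 * 1080 * 3, 4_000_000)       # grows with image size

print(f"conv layer:  {conv:,} parameters")   # 896
print(f"dense layer: {dense:,} parameters")
```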
KAHOOT TIME
DEMO CODE
GDSC AI CHALLENGE
● Challenge: Build a model that automatically classifies images into their correct labels.
● Registration form is now open!
● Timeline: 12/12/2021 - 30/12/2021
● Rewards:
○ First prize: 1.000.000 VND
○ Second prize: 700.000 VND
○ Third prize: 500.000 VND
○ 2 AI potential prizes: 400.000 VND
Q&A
Feedback form
https://tinyurl.com/37kk6dkw
Thank You!
