CERN Deep Learning and Vision
Jon Shlens
Google Research
28 April 2017
Agenda
4. Conclusions
The hubris of artificial intelligence
http://dspace.mit.edu/handle/1721.1/6125
‘Simple’ problems proved most difficult.
cat?
Machine learning applied everywhere.
classes
• electric ray
• barracuda
• coho salmon
• tench
• goldfish
• sawfish
• smalltooth sawfish
• guitarfish
• stingray
• roughtail stingray
• ...
Examples of artificial vision in action
• Good fine-grain classification: e.g. distinguishing a hibiscus from a dahlia.
• Good generalization across varied appearances.
• Sensible errors: a snake confused with a dog; two different dishes both recognized as “meal”.
History of techniques in ImageNet Challenge

ImageNet 2010
• Locality constrained linear coding + SVM (NEC & UIUC)
• Fisher kernel + SVM (Xerox Research Center Europe)
• SIFT features + LI2C (Nanyang Technological Institute)
• SIFT features + k-Nearest Neighbors (Laboratoire d'Informatique de Grenoble)
• Color features + canonical correlation analysis (National Institute of Informatics, Tokyo)

ImageNet 2011
• Compressed Fisher kernel + SVM (Xerox Research Center Europe)
• SIFT bag-of-words + VQ + SVM (University of Amsterdam & University of Trento)
• SIFT + ? (ISI Lab, Tokyo University)

ImageNet 2012
• Deep convolutional neural network (University of Toronto)
• Discriminatively trained DPMs (University of Oxford)
• Fisher-based SIFT features + SVM (ISI Lab, Tokyo University)
Deep convolutional neural networks
“cat”
Loosely inspired by (what little) we know about the brain
• no recurrence or feedback *
• no dynamics or state *
• no biophysics
f(z) = max(0, z)
The perceptron: a probabilistic model for information storage and organization in the brain.
F Rosenblatt (1958)
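A minimal NumPy sketch of this nonlinearity (the rectified linear unit), applied elementwise:

```python
import numpy as np

def relu(z):
    """Rectified linear unit: f(z) = max(0, z), applied elementwise."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(z))  # negative inputs are clipped to zero: [0. 0. 0. 1.5 3.]
```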
Employing a network for a task.
“dog”
y = f(f(...))
Example: how to classify with a network

P(j) = exp(y_j) / Σ_j exp(y_j)

where y_j is the output of node j.

[bar chart: softmax probabilities over the classes cat, dog, car, truck, cow, bicycle]
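As a sketch, the classification rule above in NumPy; the logits below are made up for illustration:

```python
import numpy as np

def softmax(y):
    """P(j) = exp(y_j) / sum_j exp(y_j); max-shifted for numerical stability."""
    e = np.exp(y - np.max(y))
    return e / e.sum()

# hypothetical output values y_j for the six classes on the slide
labels = ["cat", "dog", "car", "truck", "cow", "bicycle"]
y = np.array([3.0, 1.0, 0.2, 0.1, 0.0, -1.0])

p = softmax(y)                    # a proper probability distribution
print(labels[int(np.argmax(p))])  # prints "cat"
```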
∂loss/∂w_i
Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences
P Werbos (1974)
Learning Internal Representations by Error Propagation.
D Rumelhart, G Hinton, R Williams, James L. McClelland et al (1986)
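A toy illustration of learning by following ∂loss/∂w_i, using a single logistic unit on synthetic data (not the networks on the slides; the data, learning rate, and step count are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
t = (X[:, 0] + X[:, 1] > 0).astype(float)   # toy, linearly separable labels

w = np.zeros(2)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # forward pass (sigmoid)
    grad_w = X.T @ (p - t) / len(t)         # dloss/dw for cross-entropy loss
    grad_b = np.mean(p - t)
    w -= 0.5 * grad_w                       # descend the gradient
    b -= 0.5 * grad_b

acc = np.mean((p > 0.5) == (t > 0.5))
print(acc)  # typically close to 1.0 on this separable toy set
```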
Optimization is highly non-convex.
[figure: loss surface as a function of weight 1 and weight 2]
“4”
handwritten zip codes (MNIST, P = 28): http://yann.lecun.com/exdb/mnist/
# weights = N × M = 1000
# weights = N × P² = 78400
…
translation
cropping
dilation
contrast
rotation
scale
brightness
…
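A few of the transformations above, sketched on a toy image with NumPy; a fully connected network sees each variant as an entirely different input vector:

```python
import numpy as np

img = np.arange(16.0).reshape(4, 4)          # toy 4x4 grayscale image

translated = np.roll(img, shift=1, axis=1)   # translation by one pixel
rotated    = np.rot90(img)                   # 90-degree rotation
brighter   = img + 10.0                      # brightness shift
contrast   = (img - img.mean()) * 1.5 + img.mean()  # contrast stretch
```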
0 0 0
0 1 0
0 0 0
(the identity filter: output equals input)
https://docs.gimp.org/en/plug-in-convmatrix.html
original filter (5 x 5) blur
original filter (5 x 5) sharpen
original filter (3 x 3) vertical edge detector
original filter (3 x 3) all edge detector
interlude for convolutions
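Filters like these can be reproduced with a small hand-rolled routine; a minimal sketch (valid-mode cross-correlation, which is what deep-learning libraries usually call "convolution"):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation of a grayscale image with a kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                       # a vertical edge at column 3

identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], float)
vertical_edge = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], float)

same = conv2d(image, identity)           # reproduces the image interior
edges = conv2d(image, vertical_edge)     # responds only near the edge
```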
Multi-layer perceptron on MNIST.
“4”
logistic classifier (M = 10): # weights = N × M = 1000
input layer: # weights = N × P² = 78400 (N = 100 hidden units)
handwritten zip codes, P = 28
“4”
logistic classifier (M = 10): # weights = N × M × K = 1000K
convolutional layer (F = 5): # weights = N × F² = 2500
handwritten zip codes, P = 28
Generalizing convolutions in depth.
• A grayscale image has input depth 1; an RGB image has input depth 3.
• A filter bank (e.g. a set of edge detectors) maps the input depth to an output depth in a convolutional network.
• Input and output depth are arbitrary parameters and need not be equal.
• Convolutional neural networks operate with depths up to 1024.
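A sketch of this depth generalization: a bank of D_out filters, each spanning the full input depth, produces an output whose depth equals the number of filters. All shapes below are illustrative:

```python
import numpy as np

H, W, D_in, D_out, F = 8, 8, 3, 16, 5    # e.g. an RGB patch, 16 filters of size 5x5

x = np.random.default_rng(0).normal(size=(H, W, D_in))
filters = np.random.default_rng(1).normal(size=(F, F, D_in, D_out))

out = np.zeros((H - F + 1, W - F + 1, D_out))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        patch = x[i:i+F, j:j+F, :]       # F x F x D_in window
        # each output channel d is the full-depth dot product with filter d
        out[i, j] = np.einsum('abc,abcd->d', patch, filters)

print(out.shape)  # (4, 4, 16): output depth is set by the number of filters
```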
The first convolutional neural network.
“4”
logistic classifier (M=10)
convolutional (N=12)
convolutional (N=12)
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
S Ioffe and C Szegedy (2015)
Covariate shifts are problematic in machine learning
• The distribution of inputs to the network drifts between time = 1 and time = N.
• Adaptation or whitening is impractical in an online setting.
Previous methods for addressing covariate shifts
• Adagrad
• building invariances through normalization
• regularizing the network (e.g. dropout, maxout)
I Goodfellow et al (2013)
N Srivastava et al (2014)
Mitigate covariate shift via batch normalization.
[figure: 15%, 50% and 85% curves of the activation distribution over training]
Batch normalization speeds up training enormously.
[figure: training progress vs. number of mini-batches]
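A minimal sketch of the batch-normalization transform from Ioffe and Szegedy: normalize each feature over the mini-batch, then apply a learned per-feature scale γ and shift β. The input statistics below are made up:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: (batch, features); gamma, beta: learned per-feature parameters."""
    mu = x.mean(axis=0)                   # mini-batch mean per feature
    var = x.var(axis=0)                   # mini-batch variance per feature
    x_hat = (x - mu) / np.sqrt(var + eps) # normalized activations
    return gamma * x_hat + beta           # restore representational power

x = np.random.default_rng(0).normal(loc=5.0, scale=3.0, size=(64, 10))
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(), y.std())  # approximately 0 and 1
```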
Switching to other types of gradients
An important distinction:
• ∂loss/∂weights provides an update that “lives” in weight space
• ∂loss/∂image provides an update that “lives” in image space
Gradient propagation to find responsible pixels
[figure: responsible pixels at layer 3 and layer 5 of Inception-v3, for a “dog” image from http://mscoco.org]
Gradient propagation for distorting images.
• Apply gradient distortion, feed back the distorted image into the network (Inception-v3), and iterate.

∂loss/∂image: which pixels are sensitive to the label “dog”
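A toy sketch of both ideas, using a linear softmax "network" in place of Inception-v3 (all shapes and values are illustrative): the gradient of a class score with respect to the input highlights sensitive pixels, and ascending that gradient distorts the image toward the label.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_classes = 28 * 28, 10
W = rng.normal(size=(n_classes, n_pixels))  # toy linear classifier
x = rng.normal(size=n_pixels)               # flattened input "image"

target = 3
scores = W @ x

# for a linear model, d score[target] / d x is simply the weight row
saliency = np.abs(W[target])                # per-pixel sensitivity to the label

# ascend the gradient to push the image toward the target class
x_distorted = x + 0.1 * W[target]
assert (W @ x_distorted)[target] > scores[target]
```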
4. Conclusions
Quick Start Guide
Online resources:
http://www.tensorflow.org
http://cs231n.github.io/convolutional-networks/
Google Brain Residency Program
g.co/brainresidency