Class 01
9.520
Sasha Rakhlin, Andrea Caponnetto, Ryan Rifkin, and Tomaso Poggio
Problem of learning: a focus for
o modern math
o computer algorithms
o neuroscience
ENGINEERING APPLICATIONS
• Bioinformatics
• Computer vision
• Computer graphics, speech synthesis, creating a virtual actor
INPUT → f → OUTPUT
they are not simply using a black box. The best ones are about
the right formulation of the problem: choice of representation
(inputs, outputs), choice of examples, validating predictivity, and
not data mining.
… f (x) = wx + b
Notes
[Figure legend: data from f; function f; approximation of f. Axes: x, y]
Generalization: estimating the value of the function where
there are no data. Good generalization means
predicting the function well; most important is for the
empirical or validation error to be a good proxy of the
prediction error.
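As a rough illustration of comparing empirical error with validation error, here is a minimal sketch that fits f(x) = wx + b by least squares on some examples and measures the error on held-out examples. The data-generating function, the noise level, and the train/validation split are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical "true" function generating the data (an assumption for this sketch).
    return 2.0 * x + 1.0

# Noisy examples of f, split into a training set and a held-out validation set.
x = rng.uniform(-1.0, 1.0, size=40)
y = f(x) + 0.1 * rng.normal(size=x.shape)
x_train, y_train = x[:20], y[:20]
x_val, y_val = x[20:], y[20:]

# Fit f(x) = w x + b by least squares on the training examples only.
A = np.column_stack([x_train, np.ones_like(x_train)])
(w, b), *_ = np.linalg.lstsq(A, y_train, rcond=None)

def mse(xs, ys):
    # Mean squared error of the fitted line on the given points.
    return float(np.mean((w * xs + b - ys) ** 2))

empirical_error = mse(x_train, y_train)   # error on the examples used for fitting
validation_error = mse(x_val, y_val)      # error on examples not used for fitting
print(empirical_error, validation_error)
```

When the two printed numbers are close, the empirical error is acting as a reasonable proxy for the error on new data in this toy setting.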
Classification: function is binary
9.520, spring 2006
Thus the key requirement (the main focus of learning
theory) for solving the problem of learning from
examples is
generalization (and possibly even consistency).
$$\min_{f \in H} \left[ \frac{1}{l} \sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2 \right]$$
implies
$$f(x) = \sum_{i=1}^{l} \alpha_i K(x, x_i)$$
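For one particular choice of V, the square loss V(f(x_i) − y_i) = (f(x_i) − y_i)^2, the coefficients α of the kernel expansion can be found by solving the linear system (K + λ l I) α = y, where K is the l × l matrix with entries K(x_i, x_j). The sketch below assumes that loss and a Gaussian kernel; the kernel, its width `sigma`, the value of `lam`, and the toy data are all illustrative choices, not taken from the slides.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=0.5):
    # K(x, x') = exp(-||x - x'||^2 / (2 sigma^2)); the kernel choice is an assumption.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_rls(X, y, lam=1e-3, sigma=0.5):
    # Square loss only: the coefficients solve (K + lam * l * I) alpha = y.
    l = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * l * np.eye(l), y)

def predict(X_train, alpha, X_new, sigma=0.5):
    # f(x) = sum_i alpha_i K(x, x_i)
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy data (hypothetical): samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 1))
y = np.sin(3.0 * X[:, 0])

alpha = fit_rls(X, y)
X_new = np.array([[0.0], [0.5]])
y_new = predict(X, alpha, X_new)
print(y_new)
```

Other choices of V (for example the hinge loss used by support vector machines) keep the same form of f but generally need an iterative or quadratic-programming solver instead of a single linear solve.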
$$\min_{f \in H} \left[ \frac{1}{l} \sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2 \right]$$
$$f(x) = \sum_{i=1}^{l} c_i K(x, x_i) + b$$
[Figure: network diagram with input x1 and output f]
Theory summary
In the course we will introduce
• Related topics
• Applications
Syllabus
INPUT OUTPUT
Bioinformatics
Artificial Markets
Object categorization
Object identification
Image analysis
Graphics
Text Classification
…..
Bioinformatics application: predicting the type of
cancer from DNA chip signals
Learning from examples paradigm
[Diagram: Examples → Statistical Learning Algorithm → Prediction; New sample → Prediction]
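The paradigm can be sketched in a few lines: a learning algorithm takes labelled examples, and the result is then used to predict the output for a new sample. The sketch below uses a 1-nearest-neighbour rule purely as a stand-in for the statistical learning algorithm; the function names `train` and `predict` and the toy data are hypothetical.

```python
import numpy as np

def train(examples_x, examples_y):
    # "Training" for 1-nearest-neighbour just stores the labelled examples.
    return np.asarray(examples_x, dtype=float), np.asarray(examples_y)

def predict(model, new_sample):
    # Predict the label of the stored example closest to the new sample.
    xs, ys = model
    distances = np.linalg.norm(xs - np.asarray(new_sample, dtype=float), axis=1)
    return ys[np.argmin(distances)]

# Hypothetical toy examples: two-dimensional inputs with labels "A" and "B".
examples_x = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
examples_y = ["A", "A", "B", "B"]

model = train(examples_x, examples_y)
prediction = predict(model, [0.8, 0.9])
print(prediction)
```

Any other learning algorithm could be substituted for the nearest-neighbour rule without changing the examples-in, prediction-out structure.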
Face identification: example
System Architecture
[Diagram: Preprocessing with overcomplete dictionary of Haar wavelets; TRAINING Data Base; QP Solver; SVM Classifier]
Sung, Poggio 1994; Papageorgiou and Poggio, 1998
People classification/detection: training the system
[MPEG video]
Constantine Papageorgiou
Project Timeline
• Construction of the StreetScenes Database
• Learning of object specific features or parts
• Automatic Recognition of 10 Object Categories
• Automatic Scene Description
INPUT OUTPUT
Object identification
Object categorization
Image analysis
Graphics
Finance
Bioinformatics
…
Image Analysis
INPUT OUTPUT
Bioinformatics
Artificial Markets
Object categorization
Object identification
Image analysis
Image synthesis, e.g. Graphics
Text Classification
…..
Image Synthesis
Θ = 0° view ⇒
Θ = 45° view ⇒
• Gender classification
• Face inversion effect: experience, viewpoint, other-race, configural vs. featural representation
• Binding problem, no need for oscillations…
Neural Correlate of Categorization (NCC)
Category boundary
Categorization task
[Figure: trial sequence. Fixation (500 ms), Sample (600 ms), Delay (1000 ms), then either Test (Match), or Test (Nonmatch), Delay, Test (Match)]
[Figure: curves for Cat 100%, Cat 80%, Cat 60%; x-axis: Time from sample stimulus onset (ms), from -500 to 2000]
D. Freedman, E. Miller, M. Riesenhuber, and T. Poggio (Science, 2001)
The model fits many physiological data and
predicts several new ones…
[Figure: stimulus sequence with Image, Interval (Image-Mask), and Mask (1/f noise); timings shown: 20 msec, 30 msec]
We will see later why this is unusual and interesting for learning theory!