10. Kernel Methods | Project 2: Digit recognition (Part 1)
Unit 2: Nonlinear Classification, Linear Regression, Collaborative Filtering (2 weeks)
https://courses.edx.org/courses/course-v1:MITx+...
10. Kernel Methods
As you can see, implementing a direct mapping to the high-dimensional features is a lot of work (imagine using an even higher-dimensional feature mapping). This is where the kernel trick becomes useful.
Recall the kernel perceptron algorithm we learned in the lecture. The weights θ can be represented by a linear combination of the training feature vectors:

$$\theta = \sum_{i=1}^{n} \alpha^{(i)} y^{(i)} \phi\left(x^{(i)}\right)$$
In the softmax regression formulation, we can also apply this representation of the weights:

$$\theta_j = \sum_{i=1}^{n} \alpha_j^{(i)} y^{(i)} \phi\left(x^{(i)}\right).$$
$$h(x) = \frac{1}{\sum_{j=1}^{k} e^{[\theta_j \cdot \phi(x)/\tau] - c}}
\begin{bmatrix}
e^{[\theta_1 \cdot \phi(x)/\tau] - c} \\
e^{[\theta_2 \cdot \phi(x)/\tau] - c} \\
\vdots \\
e^{[\theta_k \cdot \phi(x)/\tau] - c}
\end{bmatrix}$$

$$h(x) = \frac{1}{\sum_{j=1}^{k} e^{\left[\sum_{i=1}^{n} \alpha_j^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c}}
\begin{bmatrix}
e^{\left[\sum_{i=1}^{n} \alpha_1^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c} \\
e^{\left[\sum_{i=1}^{n} \alpha_2^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c} \\
\vdots \\
e^{\left[\sum_{i=1}^{n} \alpha_k^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c}
\end{bmatrix}$$
We actually do not need the explicit mapping ϕ(x), only the inner product between two mapped feature vectors, ϕ(x⁽ⁱ⁾) · ϕ(x), where x⁽ⁱ⁾ is a point in the training set and x is the new data point for which we want to compute the probability. If we can define a kernel function K(x, y) = ϕ(x) · ϕ(y) for any two points x and y, we can kernelize our softmax regression algorithm.
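To make this concrete, here is a minimal sketch of how h(x) can be evaluated from kernel values alone, following the formula above. The function and argument names (kernel_softmax_h, K_col) are illustrative, not part of the project skeleton:

```python
import numpy as np

def kernel_softmax_h(K_col, alpha, y, tau):
    """Evaluate h(x) for one new point x using only kernel values.

    K_col - (n,) values K(x^(i), x) against each training point
    alpha - (k, n) coefficients alpha_j^(i)
    y     - (n,) labels y^(i)
    tau   - temperature parameter
    """
    # theta_j . phi(x) = sum_i alpha_j^(i) y^(i) K(x^(i), x), for each class j.
    logits = (alpha * y) @ K_col / tau
    logits = logits - logits.max()   # the constant c, kept for numerical stability
    exps = np.exp(logits)
    return exps / exps.sum()         # probabilities; entries sum to 1
```

The constant c cancels in the ratio, which is exactly why it can be chosen freely (here, the max logit) to avoid overflow.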
You will be working in the files part1/main.py and part1/kernel.py in this problem.
Implementing Polynomial Kernel
In the last section, we explicitly created a cubic feature mapping. Now, suppose we want to map the features into a higher-dimensional polynomial space; for the quadratic case, the mapping of a d-dimensional x is

$$\phi(x) = \left\langle x_d^2, \ldots, x_1^2, \sqrt{2}\,x_d x_{d-1}, \ldots, \sqrt{2}\,x_d x_1, \sqrt{2}\,x_{d-1} x_{d-2}, \ldots, \sqrt{2}\,x_{d-1} x_1, \ldots, \sqrt{2}\,x_2 x_1, \sqrt{2c}\,x_d, \ldots, \sqrt{2c}\,x_1, c \right\rangle$$
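As a quick sanity check for the quadratic case (d = 2, p = 2), the inner product of two explicitly mapped vectors matches (x · y + c)^p, which is what the kernel will compute. The helper name explicit_quadratic_features is ours, not part of the project:

```python
import numpy as np

def explicit_quadratic_features(x, c):
    # The mapping above, written out for d = 2:
    # <x_2^2, x_1^2, sqrt(2) x_2 x_1, sqrt(2c) x_2, sqrt(2c) x_1, c>
    x1, x2 = x
    return np.array([x2 ** 2, x1 ** 2,
                     np.sqrt(2) * x2 * x1,
                     np.sqrt(2 * c) * x2, np.sqrt(2 * c) * x1,
                     c])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
c = 1.0
lhs = explicit_quadratic_features(x, c) @ explicit_quadratic_features(y, c)
rhs = (x @ y + c) ** 2
print(lhs, rhs)  # both equal 4.0, since x . y = 1 and (1 + 1)^2 = 4
```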
Write a function polynomial_kernel that takes in two matrices X and Y and computes the polynomial kernel K(x, y) for every pair of rows x in X and y in Y.
Available Functions: You have access to the NumPy python library as np
Correct
def polynomial_kernel(X, Y, c, p):
    """
    Compute the polynomial kernel between two matrices X and Y::
        K(x, y) = (<x, y> + c)^p
    for each pair of rows x in X and y in Y.

    Args:
        X - (n, d) NumPy array (n datapoints each with d features)
        Y - (m, d) NumPy array (m datapoints each with d features)
        c - a coefficient to trade off high-order and low-order terms (scalar)
        p - the degree of the polynomial kernel

    Returns:
        kernel_matrix - (n, m) NumPy array containing the kernel matrix
    """
    # Pairwise inner products <x, y>, then shift by c and raise to the p-th power.
    K = X @ Y.T
    K += c
    K **= p
    return K
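For example, on a toy input (the numbers below are ours, not grader data; the kernel is restated as a one-liner so the snippet runs on its own):

```python
import numpy as np

def polynomial_kernel(X, Y, c, p):
    # Same computation as above: (<x, y> + c)^p for each pair of rows.
    return (X @ Y.T + c) ** p

X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[5.0, 6.0]])
print(polynomial_kernel(X, Y, c=1, p=2))
# (17 + 1)^2 = 324 and (39 + 1)^2 = 1600, so the (2, 1) result is [[324.], [1600.]]
```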
Gaussian RBF Kernel
Another commonly used kernel is the Gaussian RBF kernel. Similarly, write a function rbf_kernel that takes in two matrices X and Y and computes the RBF kernel K(x, y) for every pair of rows x in X and y in Y.
Available Functions: You have access to the NumPy python library as np
Correct
def rbf_kernel(X, Y, gamma):
    """
    Compute the Gaussian RBF kernel between two matrices X and Y::
        K(x, y) = exp(-gamma ||x-y||^2)
    for each pair of rows x in X and y in Y.

    Args:
        X - (n, d) NumPy array (n datapoints each with d features)
        Y - (m, d) NumPy array (m datapoints each with d features)
        gamma - the gamma parameter of the Gaussian function (scalar)

    Returns:
        kernel_matrix - (n, m) NumPy array containing the kernel matrix
    """
    # Squared norms ||x||^2 and ||y||^2 of every row, as column vectors.
    XTX = np.mat([np.dot(row, row) for row in X]).T
    YTY = np.mat([np.dot(row, row) for row in Y]).T
    # Expand to (n, m) so that ||x||^2 + ||y||^2 - 2 <x, y> = ||x - y||^2 pairwise.
    XTX_matrix = np.repeat(XTX, Y.shape[0], axis=1)
    YTY_matrix = np.repeat(YTY, X.shape[0], axis=1).T
    K = np.asarray(XTX_matrix + YTY_matrix - 2 * (X @ Y.T), dtype='float64')
    K *= -gamma
    return np.exp(K, K)  # in-place exponential of the scaled squared distances
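The same pairwise squared distances can be obtained more idiomatically with NumPy broadcasting (np.mat is discouraged in current NumPy); here is an equivalent sketch with a quick numeric check on toy points of our choosing:

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>, for all pairs via broadcasting.
    sq_dists = (np.sum(X ** 2, axis=1)[:, None]
                + np.sum(Y ** 2, axis=1)[None, :]
                - 2 * X @ Y.T)
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0], [1.0, 1.0]])
Y = np.array([[1.0, 0.0]])
print(rbf_kernel(X, Y, gamma=0.5))
# Both entries are exp(-0.5 * 1) ≈ 0.6065, since each row of X is at
# squared distance 1 from [1, 0].
```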
Now, try implementing the softmax regression using kernelized features. You will have to rewrite the softmax_regression function in softmax.py, as well as the auxiliary functions compute_cost_function, compute_probabilities, and run_gradient_descent_iteration.
How does the test error change?
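One possible shape for the kernelized update, sketched under assumptions rather than as the project's reference solution: keep a (k, n) matrix alpha of dual coefficients in place of θ (with the label factor absorbed into alpha for the multiclass case, so θ_j = Σ_i alpha[j, i] ϕ(x⁽ⁱ⁾)), and translate each gradient step on θ_j into a step on its coefficients. All names here (run_kernel_gradient_iteration, lr, lam) are ours:

```python
import numpy as np

def run_kernel_gradient_iteration(K, Y, alpha, tau, lr, lam):
    """One gradient step on the dual coefficients alpha (hypothetical sketch).

    K     - (n, n) kernel matrix, K[i, j] = K(x^(i), x^(j))
    Y     - (n,) integer labels in {0, ..., k-1}
    alpha - (k, n) coefficients, so that theta_j = sum_i alpha[j, i] * phi(x^(i))
    tau   - temperature parameter
    lr    - learning rate
    lam   - L2 regularization strength
    """
    n = K.shape[0]
    k = alpha.shape[0]
    # theta_j . phi(x^(i)) / tau for every class j and training point i.
    logits = alpha @ K / tau
    logits = logits - logits.max(axis=0)   # subtract c per column for stability
    P = np.exp(logits)
    P = P / P.sum(axis=0)                  # p(j | x^(i)); columns sum to 1
    # One-hot indicator [[y^(i) == j]].
    M = np.zeros((k, n))
    M[Y, np.arange(n)] = 1.0
    # Gradient of the softmax cost w.r.t. theta_j, written in the phi(x^(i)) basis:
    # stepping theta_j by -lr * grad is the same as stepping its coefficients.
    grad_coeffs = -(M - P) / (tau * n) + lam * alpha
    return alpha - lr * grad_coeffs
```

Because every quantity above depends on ϕ only through the kernel matrix K, the same polynomial_kernel or rbf_kernel output can be plugged in directly.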
In this project, you have been familiarized with the MNIST dataset for digit recognition, a popular task in computer vision. You have implemented a linear regression, which turned out to be inadequate for this task. You have also learned how to use scikit-learn's SVM for binary classification and multiclass classification. Then, you have implemented your own softmax regression using gradient descent. Finally, you experimented with different hyperparameters, different labels, and different features, including kernelized features.
In the next project, you will apply neural networks to this task.