Sample For Solution Manual For A First Course in Machine Learning by Rogers & Girolami

This document contains solutions to exercises from Chapter 1 of the book "A First Course in Machine Learning" by Simon Rogers and Mark Girolami. The exercises cover topics related to linear regression, including deriving the normal equations and fitting linear models to Olympic running data. Key points covered include: 1) Deriving the normal equations for linear regression; 2) Fitting linear models to women's 100m Olympic running data and predicting times for 2012 and 2016; 3) Comparing the rates of improvement between men's and women's 100m times.

To access the full version of the solution manual, click the link below or visit the "Ebookyab" website:

https://ebookyab.com/solution-manual-a-first-course-in-machine-learning-rogers-girolami/

Chapter 1

EX 1.1. A high positive value of w0 and a small negative value for w1. These reflect the high intercept on the t axis (corresponding to the theoretical winning time at x = 0) and the small decrease in winning time over the years.

EX 1.2. The following would do the job:

% Attributes are stored in Nx1 vector x
% Targets are stored in Nx1 vector t
xb = mean(x);
tb = mean(t);
x2b = mean(x.*x);
xtb = mean(x.*t);
w1 = (xtb - tb*xb)/(x2b - xb^2);
w0 = tb - w1*xb;
% Plot the data
plot(x,t,'b.','markersize',25);
% Plot the model
hold on;
plot(x,w0+w1*x,'r','linewidth',2);
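
As a quick sanity check (an addition, not part of the original solution), the closed-form coefficients can be compared against MATLAB's built-in polyfit, which fits the same least-squares line:

% polyfit returns coefficients in descending powers, so for a
% degree-1 fit p(1) should match w1 and p(2) should match w0.
p = polyfit(x, t, 1);
fprintf('w1 = %.4f (polyfit: %.4f)\n', w1, p(1));
fprintf('w0 = %.4f (polyfit: %.4f)\n', w0, p(2));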

EX 1.3. We need to find $\mathbf{w}^T\mathbf{X}^T\mathbf{X}\mathbf{w}$. We'll start with $\mathbf{X}^T\mathbf{X}$. Multiplying $\mathbf{X}^T$ by $\mathbf{X}$ gives:

$$\mathbf{X}^T\mathbf{X} = \begin{bmatrix} \sum_{n=1}^N x_{n1}^2 & \sum_{n=1}^N x_{n1}x_{n2} \\ \sum_{n=1}^N x_{n2}x_{n1} & \sum_{n=1}^N x_{n2}^2 \end{bmatrix}$$

Multiplying this by $\mathbf{w}$ gives:

$$\mathbf{X}^T\mathbf{X}\mathbf{w} = \begin{bmatrix} w_0 \sum_{n=1}^N x_{n1}^2 + w_1 \sum_{n=1}^N x_{n1}x_{n2} \\ w_0 \sum_{n=1}^N x_{n2}x_{n1} + w_1 \sum_{n=1}^N x_{n2}^2 \end{bmatrix}$$


Finally, pre-multiplying this by $\mathbf{w}^T$ gives:

$$\begin{aligned}
\mathbf{w}^T\mathbf{X}^T\mathbf{X}\mathbf{w} &= w_0 \left( w_0 \sum_{n=1}^N x_{n1}^2 + w_1 \sum_{n=1}^N x_{n1}x_{n2} \right) + w_1 \left( w_0 \sum_{n=1}^N x_{n2}x_{n1} + w_1 \sum_{n=1}^N x_{n2}^2 \right) \\
&= w_0^2 \sum_{n=1}^N x_{n1}^2 + 2 w_0 w_1 \sum_{n=1}^N x_{n1}x_{n2} + w_1^2 \sum_{n=1}^N x_{n2}^2
\end{aligned}$$
as required.
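
As a numerical aside (not in the original text), the identity can be checked in MATLAB on a small random design matrix:

% Verify the expansion of w'*X'*X*w for a two-column X.
% N and the values of w are arbitrary choices for this check.
N = 5;
X = randn(N,2);
w = [0.3; -1.2]; % w = [w0; w1]
lhs = w' * (X'*X) * w;
rhs = w(1)^2*sum(X(:,1).^2) + 2*w(1)*w(2)*sum(X(:,1).*X(:,2)) ...
      + w(2)^2*sum(X(:,2).^2);
disp(lhs - rhs); % should be ~0 up to round-off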
EX 1.4. Let's first work out $\mathbf{X}\mathbf{w}$:

$$\mathbf{X}\mathbf{w} = \begin{bmatrix} w_0 x_{11} + w_1 x_{12} \\ w_0 x_{21} + w_1 x_{22} \\ \vdots \\ w_0 x_{N1} + w_1 x_{N2} \end{bmatrix}$$

Therefore

$$(\mathbf{X}\mathbf{w})^T = [w_0 x_{11} + w_1 x_{12}, \; w_0 x_{21} + w_1 x_{22}, \; \dots, \; w_0 x_{N1} + w_1 x_{N2}]$$

Finally, work out $\mathbf{w}^T\mathbf{X}^T$:

$$\mathbf{w}^T\mathbf{X}^T = [w_0 x_{11} + w_1 x_{12}, \; w_0 x_{21} + w_1 x_{22}, \; \dots, \; w_0 x_{N1} + w_1 x_{N2}]$$
as required.
EX 1.5. Starting with $\sum_n \mathbf{x}_n t_n$: the result of this is a column vector of the same size as $\mathbf{x}_n$ ($2 \times 1$). Now, using the definition of $\mathbf{X}$,

$$\mathbf{X}^T = \begin{bmatrix} x_{11} & x_{21} & \dots & x_{N1} \\ x_{12} & x_{22} & \dots & x_{N2} \end{bmatrix}$$

(which is a $2 \times N$ matrix). Multiplying this by $\mathbf{t}$ gives a $2 \times 1$ vector that looks like this:

$$\mathbf{X}^T\mathbf{t} = \begin{bmatrix} \sum_{n=1}^N x_{n1} t_n \\ \sum_{n=1}^N x_{n2} t_n \end{bmatrix}$$

which is $\sum_n \mathbf{x}_n t_n$ as required. The second example is $\mathbf{X}^T\mathbf{X}\mathbf{w}$. We already know what $\mathbf{X}^T\mathbf{X}\mathbf{w}$ is (Exercise 1.3):

$$\mathbf{X}^T\mathbf{X}\mathbf{w} = \begin{bmatrix} w_0 \sum_{n=1}^N x_{n1}^2 + w_1 \sum_{n=1}^N x_{n1}x_{n2} \\ w_0 \sum_{n=1}^N x_{n2}x_{n1} + w_1 \sum_{n=1}^N x_{n2}^2 \end{bmatrix}$$

Now, $\mathbf{x}_n\mathbf{x}_n^T$ is the following matrix:

$$\mathbf{x}_n\mathbf{x}_n^T = \begin{bmatrix} x_{n1}^2 & x_{n1}x_{n2} \\ x_{n2}x_{n1} & x_{n2}^2 \end{bmatrix}$$

Multiplying this by $\mathbf{w}$ gives:

$$\mathbf{x}_n\mathbf{x}_n^T\mathbf{w} = \begin{bmatrix} w_0 x_{n1}^2 + w_1 x_{n1}x_{n2} \\ w_0 x_{n2}x_{n1} + w_1 x_{n2}^2 \end{bmatrix}$$

Summing over the $N$ terms leads us to the matrix we derived previously.
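
Again as a numerical aside (not part of the original solution), both identities can be checked in MATLAB:

% Check X'*t == sum_n x_n*t_n and X'*X*w == sum_n (x_n*x_n')*w
% for a small random two-column design matrix (values are arbitrary).
N = 5;
X = randn(N,2); t = randn(N,1); w = randn(2,1);
s1 = zeros(2,1); s2 = zeros(2,1);
for n = 1:N
    xn = X(n,:)'; % x_n as a 2x1 column vector
    s1 = s1 + xn*t(n);
    s2 = s2 + (xn*xn')*w;
end
disp(norm(X'*t - s1));   % should be ~0
disp(norm(X'*X*w - s2)); % should be ~0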


EX 1.6. Code below:

%% Women's 100m data
% Load all Olympic data
load olympics;
% Copy the necessary variables
x = female100(:,1); % Olympic year
t = female100(:,2); % Winning time
% Augment x
X = [repmat(1,size(x)) x];
% Get solution
w = inv(X'*X)*X'*t;

The fitted model is:

$$t = 40.9242 - 0.0151x$$

EX 1.7. Plugging 2012 and 2016 into the above expression yields winning times of 10.5997
and 10.5394 respectively.
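
In MATLAB this is a one-liner (a sketch, assuming w from EX 1.6 is still in the workspace):

% Predict winning times for 2012 and 2016 from the fitted linear model.
xnew = [2012; 2016];
Xnew = [ones(size(xnew)) xnew];
pred = Xnew*w % approximately [10.5997; 10.5394]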
EX 1.8. The men's model is:

$$t = 36.4165 - 0.0133x$$

The women's model is:

$$t = 40.9242 - 0.0151x$$

The women's time is decreasing faster than the men's. Therefore, the women will be faster at the first Olympics after the $x$ that gives identical winning times:

$$40.9242 - 0.0151x = 36.4165 - 0.0133x$$

$$x = 2589$$

The next Olympic year after 2589 (assuming the Games continue to be held every four years) is 2592. The winning times are the unrealistically fast 1.8580 seconds and 1.8628 seconds for women and men respectively.
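
The crossover can also be computed directly in MATLAB. This is a sketch: wf and wm are assumed to hold the unrounded women's and men's coefficient vectors [w0; w1] from the fits above (with the rounded coefficients printed above, the crossover comes out as a somewhat different year):

% Year at which the two fitted lines intersect, and the next Olympic year.
xc = (wf(1) - wm(1)) / (wm(2) - wf(2)); % solve wf0 + wf1*x = wm0 + wm1*x
next_games = 4*ceil(xc/4);              % next Olympic year (multiple of 4)
tf = wf(1) + wf(2)*next_games;          % predicted women's time
tm = wm(1) + wm(2)*next_games;          % predicted men's time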
EX 1.9. Code below (synthdata_cv.m):

clear all; close all;
load synthdata

% Augment x
X = repmat(1,size(x));
for k = 1:4
    X = [X x.^k];
end

% Fit the model
w = inv(X'*X)*X'*t;

% Randomise the data order
N = size(X,1);
order = randperm(N);
sizes = repmat(floor(N/10),1,10);
sizes(end) = sizes(end) + N - sum(sizes);
sizes = [0 cumsum(sizes)];

X = repmat(1,size(x));

loss = zeros(4,10);
for poly_order = 1:4
    % Augment x
    X = [X x.^poly_order];
    for k = 1:10 % 10-fold CV
        % Extract the train and test data
        traindata = X(order,:);
        traint = t(order);
        testdata = X(order(sizes(k)+1:sizes(k+1)),:);
        testt = t(order(sizes(k)+1:sizes(k+1)));
        traindata(sizes(k)+1:sizes(k+1),:) = [];
        traint(sizes(k)+1:sizes(k+1)) = [];

        % Fit the model
        w = inv(traindata'*traindata)*traindata'*traint;

        % Compute loss on test data
        predictions = testdata*w;
        loss(poly_order,k) = sum((predictions - testt).^2);
    end
end

% Plot the mean loss over the folds
plot([1:4],mean(loss,2));
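
To read off the order that cross-validation selects (an addition, not in the original listing):

% Pick the polynomial order with the lowest mean held-out loss.
[~, best_order] = min(mean(loss,2));
fprintf('Best polynomial order: %d\n', best_order);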

EX 1.10. The total loss is

$$\mathcal{L} = \sum_{n=1}^N (t_n - \mathbf{w}^T\mathbf{x}_n)^2.$$

Writing this in matrix form, differentiating and solving gives us:

$$\begin{aligned}
\mathcal{L} &= (\mathbf{t} - \mathbf{X}\mathbf{w})^T(\mathbf{t} - \mathbf{X}\mathbf{w}) \\
&= \mathbf{t}^T\mathbf{t} - 2\mathbf{w}^T\mathbf{X}^T\mathbf{t} + \mathbf{w}^T\mathbf{X}^T\mathbf{X}\mathbf{w} \\
\frac{\partial \mathcal{L}}{\partial \mathbf{w}} &= -2\mathbf{X}^T\mathbf{t} + 2\mathbf{X}^T\mathbf{X}\mathbf{w} = 0 \\
\mathbf{w} &= (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{t}.
\end{aligned}$$

This is identical to the value obtained for the average loss. This is not surprising, as all we are doing is multiplying the loss by a constant, and that does not change the value of $\mathbf{w}$ at the minimum.
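
A quick numeric illustration (an aside, not in the original text) that scaling the loss leaves the minimiser unchanged:

% The normal equations give the same w whether we minimise the total
% or the average loss: the 1/N factor cancels in the solution.
N = 20;
X = [ones(N,1) randn(N,1)];
t = randn(N,1);
w_total = inv(X'*X)*X'*t;
w_avg = inv((X'*X)/N)*((X'*t)/N); % identical result
disp(norm(w_total - w_avg)); % ~0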

EX 1.11. The loss is given by

$$\mathcal{L} = \frac{1}{N} \sum_{n=1}^N \alpha_n (t_n - \mathbf{w}^T\mathbf{x}_n)^2$$

If we define the matrix:

$$\mathbf{A} = \begin{bmatrix} \alpha_1 & 0 & \dots & 0 \\ 0 & \alpha_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \alpha_N \end{bmatrix}$$

we can write the loss in vector/matrix form as:

$$\mathcal{L} = \frac{1}{N} (\mathbf{t} - \mathbf{X}\mathbf{w})^T \mathbf{A} (\mathbf{t} - \mathbf{X}\mathbf{w})$$

Multiplying out, differentiating, equating to zero and solving:

$$\begin{aligned}
\mathcal{L} &= \frac{1}{N} \left( \mathbf{t}^T\mathbf{A}\mathbf{t} - 2\mathbf{w}^T\mathbf{X}^T\mathbf{A}\mathbf{t} + \mathbf{w}^T\mathbf{X}^T\mathbf{A}\mathbf{X}\mathbf{w} \right) \\
\frac{\partial \mathcal{L}}{\partial \mathbf{w}} &= -\frac{2}{N}\mathbf{X}^T\mathbf{A}\mathbf{t} + \frac{2}{N}\mathbf{X}^T\mathbf{A}\mathbf{X}\mathbf{w} = 0 \\
\mathbf{w} &= (\mathbf{X}^T\mathbf{A}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{A}\mathbf{t}.
\end{aligned}$$

Try this out in Matlab (set some αn very low and some very high) to see the effect
on the solution.
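
A minimal sketch of that experiment (the data and the particular α values are illustrative choices):

% Weighted least squares: points with large alpha dominate the fit,
% points with small alpha are effectively ignored.
N = 50;
x = linspace(0,1,N)';
t = 2 - 3*x + 0.1*randn(N,1); % synthetic straight-line data plus noise
X = [ones(N,1) x];
alpha = ones(N,1);
alpha(1:10) = 100;   % trust the first ten points strongly
alpha(41:50) = 0.01; % nearly ignore the last ten
A = diag(alpha);
w = inv(X'*A*X)*X'*A*t; % weighted solution derived above
plot(x, t, 'b.', x, X*w, 'r', 'linewidth', 2);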

EX 1.12. Code below (regls100m.m):

clear all; close all;
load olympics;
% Extract men's 100m data
x = male100(:,1);
t = male100(:,2);

% Choose number of folds
K = 5;

% Randomise the data order
N = size(x,1);
order = randperm(N);
sizes = repmat(floor(N/K),1,K);
sizes(end) = sizes(end) + N - sum(sizes);
sizes = [0 cumsum(sizes)];

% Rescale x
x = x - x(1);
x = x./4;

X = [repmat(1,size(x)) x];
% Comment out the following line for linear
X = [X x.^2 x.^3 x.^4];

% Scan a wide range of values of the regularisation parameter
regvals = 10.^[-12:1:12];

loss = zeros(length(regvals),K);
for r = 1:length(regvals)
    for k = 1:K
        % Extract the train and test data
        traindata = X(order,:);
        traint = t(order);
        testdata = X(order(sizes(k)+1:sizes(k+1)),:);
        testt = t(order(sizes(k)+1:sizes(k+1)));
        traindata(sizes(k)+1:sizes(k+1),:) = [];
        traint(sizes(k)+1:sizes(k+1)) = [];

        % Fit the model
        w = inv(traindata'*traindata + regvals(r)*eye(size(X,2)))*...
            traindata'*traint;

        % Compute loss on test data
        predictions = testdata*w;
        loss(r,k) = sum((predictions - testt).^2);
    end
end
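
To summarise the scan (an addition, not in the original listing), the mean held-out loss can be plotted against the regularisation parameter and the best value picked out:

% Plot mean CV loss against the regularisation parameter; pick the best.
semilogx(regvals, mean(loss,2));
xlabel('Regularisation parameter'); ylabel('Mean CV loss');
[~, best] = min(mean(loss,2));
fprintf('Best regularisation value: %g\n', regvals(best));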
