Sol 4
y_t = W x_t + V s_t
s_{t+1} = y_t
from some initial state s_0, where t denotes the t-th call of the RNN, i.e., x_t is the t-th input.
Figure 1
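A minimal Python sketch of this recurrence, assuming NumPy arrays; the dimensions, the random weights, and the helper name rnn_forward are illustrative assumptions, not part of the exercise.

```python
import numpy as np

# Illustrative dimensions and randomly initialized weights.
rng = np.random.default_rng(0)
d_in, d_state = 3, 2
W = rng.normal(size=(d_state, d_in))     # maps the input x_t to the output
V = rng.normal(size=(d_state, d_state))  # maps the state s_t to the output

def rnn_forward(xs, s0):
    """Run the RNN on a sequence of inputs xs, starting from state s0."""
    s = s0
    ys = []
    for x in xs:
        y = W @ x + V @ s  # y_t = W x_t + V s_t
        s = y              # s_{t+1} = y_t
        ys.append(y)
    return ys

xs = [rng.normal(size=d_in) for _ in range(3)]
ys = rnn_forward(xs, s0=np.zeros(d_state))
```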
(a) What is the recurrent state in the RNN from Figure 1? Name one example of data that is more naturally modeled with RNNs than with feedforward neural networks.
(b) Because the state of an RNN changes across successive calls, the loss functions that we use for feedforward neural networks do not yield consistent results. For a given dataset X, propose a loss function (based on the mean squared error) for RNNs and justify your choice.
(c) For a dataset X := (x_t, y_t)_{t=1}^{k} (for some k ∈ N), show how information is propagated by drawing a feedforward neural network that corresponds to the RNN from Figure 1 for k = 3. Recall that a feedforward neural network does not contain nodes with a persistent state. (Hint: unfold the RNN.)
Solution 1:
(a) The recurrent state is denoted s. In this case it coincides with the output. Recurrent models are used to model data with temporal structure, e.g. time series, speech, or sound.
(b) We have data X = {(x_t, y_t)}_{t=1}^{T}, where we assume that the data is ordered temporally. Thus, we define the loss function to be
L(W, V, s_0) = Σ_{t=1}^{T} (y_t − f(x_t, s_{t−1}; W, V))^2,
where f(x_t, s_{t−1}; W, V) = W x_t + V s_{t−1} is the network output at step t and s_{t−1} is the previous recurrent state. The initial state s_0 needs to be specified, and the loss depends on it as well.
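A minimal sketch of evaluating this loss, assuming the RNN from Figure 1 and NumPy arrays; the helper name rnn_loss and the sequence containers are illustrative assumptions.

```python
import numpy as np

def rnn_loss(W, V, s0, xs, ys):
    """Sum of squared errors over the whole sequence, as proposed above.

    The RNN is unrolled from the initial state s0, so the loss depends on
    W, V and s0 jointly.
    """
    s = s0
    loss = 0.0
    for x_t, y_t in zip(xs, ys):
        pred = W @ x_t + V @ s             # f(x_t, s_{t-1}; W, V)
        loss += np.sum((y_t - pred) ** 2)  # squared error for step t
        s = pred                           # state passed to the next step
    return loss
```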
(c) Unfolding the RNN for k = 3 gives a feedforward network with one copy of the update per time step, the output of each copy feeding the state input of the next; see Figure 2.
Figure 2
Problem 2 (Expressiveness of Neural Networks):
In this question we will consider neural networks with sigmoid activation functions of the form
ϕ(z) = 1 / (1 + exp(−z)).
If we denote by v_j^l the value of neuron j at layer l, it is computed as
v_j^l = ϕ( w_0 + Σ_{i ∈ Layer_{l−1}} w_{j,i} v_i^{l−1} ).
In the following questions you will have to design neural networks that compute functions of two Boolean inputs
X1 and X2 . Given that the outputs of the sigmoid units are real numbers Y ∈ (0, 1), we will treat the final
output as Boolean by considering it as 1 if greater than 0.5 and 0 otherwise.
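A small Python helper that evaluates such a unit and applies the 0.5 threshold described above; the function names sigmoid and unit are illustrative assumptions.

```python
import math

def sigmoid(z):
    # The activation phi(z) = 1 / (1 + exp(-z)) from above.
    return 1.0 / (1.0 + math.exp(-z))

def unit(w0, w1, w2, x1, x2):
    # One sigmoid unit on two Boolean inputs, thresholded at 0.5.
    out = sigmoid(w0 + w1 * x1 + w2 * x2)
    return out, int(out > 0.5)

print(unit(-0.5, 1, 1, 1, 0))  # example call
```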
(a) Give 3 weights w0 , w1 , w2 for a single unit with two inputs X1 and X2 that implements the logical OR
function Y = X1 ∨ X2 .
(b) Can you implement the logical AND function Y = X1 ∧ X2 using a single unit? If so, give weights that
achieve this. If not, explain the problem.
(c) It is impossible to implement the XOR function Y = X1 ⊕ X2 using a single unit. However, you can do it using a multi-layer neural network. Use the smallest number of units you can to implement the XOR function.
Draw your network and show all the weights.
(d) Create a neural network with only one hidden layer (of any number of units) that implements
(A ∨ ¬B) ⊕ (¬C ∨ ¬D).
Draw your network and show all the weights.
Solution 2:
(a) We choose the weights w_0 = −0.5, w_1 = 1 and w_2 = 1, and check that the output is the desired OR function. The unit computes
A ∨ B = round(ϕ(w_0 + w_1 A + w_2 B)) = round(ϕ(−0.5 + A + B))
A  B  A∨B  Network output  Rounded
1  1   1     ϕ(1.5)  ≈ 0.82    1
0  1   1     ϕ(0.5)  ≈ 0.62    1
1  0   1     ϕ(0.5)  ≈ 0.62    1
0  0   0     ϕ(−0.5) ≈ 0.38    0
(b) Yes: take w_0 = −1.5, w_1 = 1 and w_2 = 1, and check that the output is the desired AND function. The unit computes
A ∧ B = round(ϕ(w_0 + w_1 A + w_2 B)) = round(ϕ(−1.5 + A + B))
A  B  A∧B  Network output  Rounded
1  1   1     ϕ(0.5)  ≈ 0.62    1
0  1   0     ϕ(−0.5) ≈ 0.38    0
1  0   0     ϕ(−0.5) ≈ 0.38    0
0  0   0     ϕ(−1.5) ≈ 0.18    0
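The two truth tables above can be reproduced numerically from the weights in (a) and (b); a short Python check, with w_1 = w_2 = 1 in both cases.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Bias weights from parts (a) and (b).
for name, w0 in [("OR", -0.5), ("AND", -1.5)]:
    print(name)
    for a, b in [(1, 1), (0, 1), (1, 0), (0, 0)]:
        out = sigmoid(w0 + a + b)
        print(a, b, round(out, 2), int(out > 0.5))
```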
(c) We use a network with one hidden layer containing two units. We find the weights by fixing the weights of the first layer and then choosing the weights of the output unit so that the required inequalities are satisfied.
Figure 3
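Since the weights drawn in Figure 3 are not reproduced in the text, the Python sketch below shows one possible assignment with two hidden units; the weights are scaled by a factor of 10 so that the sigmoids saturate and behave almost like hard thresholds. The specific values are an illustrative assumption, not necessarily those of the figure.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xor_net(x1, x2):
    """Two hidden units plus one output unit implementing XOR."""
    h1 = sigmoid(10 * (-0.5 + x1 + x2))   # ~ x1 OR x2
    h2 = sigmoid(10 * ( 1.5 - x1 - x2))   # ~ NOT (x1 AND x2)
    out = sigmoid(10 * (-1.5 + h1 + h2))  # ~ h1 AND h2
    return int(out > 0.5)

assert [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]
```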
(d) We first rewrite the expression in disjunctive normal form:
(A ∨ ¬B) ⊕ (¬C ∨ ¬D) ⇐⇒ (A ∧ C ∧ D) ∨ (¬A ∧ B ∧ ¬C) ∨ (¬A ∧ B ∧ ¬D) ∨ (¬B ∧ C ∧ D). (1)
Our expression thus decomposes into 4 conjunctions combined by logical ORs. We therefore use 4 hidden units, each modeling one of the AND terms, and an output unit that takes the OR of all of them. Each AND unit can be implemented using the same idea as in the previous parts. In full,
output(A, B, C, D) = H(h_1(A, C, D), h_2(A, B, C), h_3(A, B, D), h_4(B, C, D)), (2)
where h_1, ..., h_4 implement the four conjunctions of Eq. (1) and H implements the OR.
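One possible set of weights for this construction can be checked in Python; the values below are an illustrative assumption rather than those of the drawn network. Each hidden unit implements one conjunction of Eq. (1) with weight +1 for a positive literal, −1 for a negated one, and bias 0.5 minus the number of positive literals; all weights are scaled by 10 so the sigmoids act as hard thresholds.

```python
import math
from itertools import product

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def step(bias, weights, inputs, scale=10.0):
    """A sigmoid unit with scaled weights, acting like a hard threshold."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(scale * z)

def net(a, b, c, d):
    # Hidden layer: one unit per AND term of Eq. (1).
    h1 = step(-2.5, [ 1,  0,  1,  1], [a, b, c, d])  #  A and  C and  D
    h2 = step(-0.5, [-1,  1, -1,  0], [a, b, c, d])  # ~A and  B and ~C
    h3 = step(-0.5, [-1,  1,  0, -1], [a, b, c, d])  # ~A and  B and ~D
    h4 = step(-1.5, [ 0, -1,  1,  1], [a, b, c, d])  # ~B and  C and  D
    # Output layer: OR of the four hidden units.
    out = step(-0.5, [1, 1, 1, 1], [h1, h2, h3, h4])
    return int(out > 0.5)

# Check against the target formula (A or not B) xor (not C or not D).
for a, b, c, d in product((0, 1), repeat=4):
    assert net(a, b, c, d) == int((a or not b) != (not c or not d))
```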
Figure 4
(a) Consider a single training example x = [x_1, x_2, x_3] with target output (label) y. Write down the sequence of calculations required to compute the squared error cost (called forward propagation).
(b) A way to reduce the number of parameters to avoid overfitting is to tie certain weights together, so that they share a parameter. Suppose we decide to tie the weights w_1 and w_4, so that w_1 = w_4 = w_tied. What is the derivative of the error E with respect to w_tied, i.e. ∇_{w_tied} E?
(c) For a data set D = {(x^{(1)}, y^{(1)}), ..., (x^{(n)}, y^{(n)})} consisting of n labeled examples, write the pseudocode of the stochastic gradient descent algorithm with learning rate η_t for optimizing the weight w_tied (assume all the other parameters are fixed).
Solution 3:
Past exam question; a detailed solution is not provided.
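For part (c), a generic sketch of the requested stochastic gradient descent loop is given below. It assumes a hypothetical function grad_wtied(x, y, w_tied) standing in for the derivative asked for in part (b); since the network of Figure 4 is not reproduced here, this is only an illustration, not the official solution.

```python
import random

def sgd_tied_weight(data, w_tied, grad_wtied, eta, num_epochs=10):
    """Generic SGD loop: only w_tied is updated, all other parameters fixed.

    data       : list of (x, y) examples
    grad_wtied : placeholder for the derivative dE/dw_tied from part (b)
    eta        : callable giving the step-dependent learning rate eta_t
    """
    examples = list(data)
    step = 0
    for _ in range(num_epochs):
        random.shuffle(examples)  # visit the examples in random order
        for x, y in examples:
            w_tied -= eta(step) * grad_wtied(x, y, w_tied)  # w <- w - eta_t * dE/dw_tied
            step += 1
    return w_tied
```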