Week 3
1. How many parameters (including biases) are there in the entire network?
Correct Answer: 2274
Solution:
Number of Parameters
Input Layer to h1: 200 × 10 + 10 = 2010
h1 to h2 : 10 × 10 + 10 = 110
h2 to h3 : 10 × 10 + 10 = 110
h3 to Output Layer: 10 × 4 + 4 = 44
Total Parameters: 2010 + 110 + 110 + 44 = 2274
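The layer-by-layer count above can be checked with a short script. This is a minimal sketch that assumes a fully connected network with layer sizes 200, 10, 10, 10, 4, as implied by the solution; the function name `count_parameters` is illustrative and not part of the assignment.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # n_in * n_out weights plus one bias per output unit
        total += n_in * n_out + n_out
    return total


# Assumed sizes: input, h1, h2, h3, output
layer_sizes = [200, 10, 10, 10, 4]
total_params = count_parameters(layer_sizes)
print(total_params)  # 2274
```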
2. Suppose all elements in the input vector are zero, and the corresponding true label is
also 0. Further, suppose that all the parameters (weights and biases) are initialized
to zero. What is the loss value if the cross-entropy loss function is used? Use the
natural logarithm (ln).
Correct Answer: Range(1.317,1.455)
Solution:
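Loss with Zero Inputs and Parameters
Input: x = 0, weights and biases = 0.
Hidden Layers: every pre-activation is 0, so each unit outputs σ(0) = 0.5.
Output Layer Logits: the output weights and biases are zero, so the logits are [0, 0, 0, 0].
Softmax: softmax(z_i) = 1/4, ∀i.
Cross-Entropy Loss: −ln(1/4) = ln(4) ≈ 1.386.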
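The same result can be reproduced numerically. The sketch below assumes four output classes with all logits equal to zero, as derived in the solution; the helper names `softmax` and `cross_entropy` are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(probs, true_class):
    """Cross-entropy loss for a single example, using the natural log."""
    return -math.log(probs[true_class])


# With all weights and biases zero, every output logit is zero.
logits = [0.0, 0.0, 0.0, 0.0]
probs = softmax(logits)          # uniform: 0.25 for each class
loss = cross_entropy(probs, 0)   # true label is 0
print(probs, round(loss, 3))     # loss is ln(4), about 1.386
```

Because the softmax output is uniform, the loss is ln(4) whichever class is the true label.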
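[Figure: feedforward network diagram showing the input layer, hidden layer 1 (units up to h_9^(1)), hidden layer 2 (units up to h_5^(2)), the vectors a1 and a2, and the weight matrices W1, W2, W3.]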
In the diagram, W1 is a matrix and x, a1, h1, and O are all column vectors. The
notation Wi[j, :] denotes the j-th row of the matrix Wi, Wi[:, j] denotes the j-th column
of the matrix Wi, and W_{kij} denotes the element in the i-th row and j-th column of the matrix
Wk.
(a) W1 ∈ R^{3×9}
(b) a1 ∈ R^{9×5}
(c) W1 ∈ R^{9×3}
(d) a1 ∈ R^{1×9}
(e) W1 ∈ R^{1×9}
(f) a1 ∈ R^{9×1}
(a) Logistic
(b) Step function
(c) Softmax
(d) linear
7. Given two probability distributions p and q, under what conditions is the cross entropy
between them minimized?
8. Given that the probability of Event A occurring is 0.18 and the probability of Event
B occurring is 0.92, which of the following statements is correct?
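The following network doesn’t contain any biases and the weights of the network are
given below:
$$W_1 = \begin{bmatrix} 1 & 1 & 3 \\ 2 & -1 & 1 \\ 1 & 2 & -2 \end{bmatrix}, \quad W_2 = \begin{bmatrix} 1 & 1 & 2 \\ 3 & 1 & 1 \end{bmatrix}, \quad W_3 = \begin{bmatrix} 1 & 2 \end{bmatrix}$$
The input to the network is: $x = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$
The target value y is: y = 5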
9. What is the predicted output for the given input x after doing the forward pass?
Correct Answer: Range(2.9,3.0)
Solution:
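Doing the forward pass in the network, we get:
$$h_1 = W_1 \cdot x = \begin{bmatrix} 1 & 1 & 3 \\ 2 & -1 & 1 \\ 1 & 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 1 \\ 3 \end{bmatrix}$$
$$a_1 = \text{sigmoid}(h_1) = \begin{bmatrix} 0.997 \\ 0.731 \\ 0.952 \end{bmatrix}$$
$$h_2 = W_2 \cdot a_1 = \begin{bmatrix} 1 & 1 & 2 \\ 3 & 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0.997 \\ 0.731 \\ 0.952 \end{bmatrix} = \begin{bmatrix} 3.632 \\ 4.674 \end{bmatrix}$$
$$a_2 = \text{sigmoid}(h_2) = \begin{bmatrix} 0.974 \\ 0.990 \end{bmatrix}$$
$$\hat{y} = W_3 \cdot a_2 = \begin{bmatrix} 1 & 2 \end{bmatrix} \cdot \begin{bmatrix} 0.974 \\ 0.990 \end{bmatrix} = 2.954$$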
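The forward pass can be verified with a few lines of standard-library Python. This is a minimal sketch assuming the weights W1, W2, W3 and the input x given in the problem, with no biases and a sigmoid after each of the first two layers; the helper names `sigmoid` and `matvec` are illustrative.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


W1 = [[1, 1, 3], [2, -1, 1], [1, 2, -2]]
W2 = [[1, 1, 2], [3, 1, 1]]
W3 = [[1, 2]]
x = [1, 2, 1]

h1 = matvec(W1, x)             # pre-activation of layer 1: [6, 1, 3]
a1 = [sigmoid(z) for z in h1]  # activation of layer 1
h2 = matvec(W2, a1)            # pre-activation of layer 2
a2 = [sigmoid(z) for z in h2]  # activation of layer 2
y_hat = matvec(W3, a2)[0]      # linear output layer, scalar prediction
print(h1, round(y_hat, 3))
```

Because this script keeps full floating-point precision, its result is slightly above the hand-computed 2.954, which truncates the intermediate values to three decimals. Both fall inside the accepted range of 2.9 to 3.0.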
10. Compute and enter the loss between the output generated by input x and the true
output y.
Correct Answer: Range(3.97,4.39)
Solution: Loss = (5 − 2.954)^2 ≈ 4.1861
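The arithmetic can be checked directly. The sketch below assumes the squared-error loss used in the solution (with no 1/2 factor) and the rounded prediction 2.954; the function name `squared_error` is illustrative.

```python
def squared_error(y_true, y_pred):
    """Squared error between the target and the prediction."""
    return (y_true - y_pred) ** 2


loss = squared_error(5.0, 2.954)
print(round(loss, 4))  # 4.1861
```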