
E9 309 – Advanced Deep Learning

Midterm Mock Question Paper


October 2021

Instructions

1. This exam is open book. However, computers, mobile phones and other handheld devices
are not allowed.

2. Any reference materials that are used in the exam (other than materials distributed in
the course webpage) should be pre-approved with the instructor before the exam.

3. No additional resources (other than those pre-approved) are allowed for use in the exam.

4. Academic integrity and ethics of highest order are expected.

5. Notation - bold symbols are vectors, capital bold symbols are matrices and regular symbols
are scalars.

6. Answer all questions.

7. Total Duration - 180 minutes including answer upload

8. Total Marks - 100 points

Name - ................................

Dept. - ....................

SR Number - ....................
1. The variational autoencoder model attempts to learn the parameters by maximizing the
data likelihood. Let $p_D(x)$ denote the true underlying distribution of $x$. If the latent
representation $z$, with a prior distribution $p(z)$, is approximated using $q_\phi(z|x)$ and the
reconstruction is modeled as $p_\theta(x|z)$, then the training objective is

$$\mathbb{E}_{p_D(x)}[\log p_\theta(x)] = \mathbb{E}_{p_D(x)}\big[\log \mathbb{E}_{p(z)}[p_\theta(x|z)]\big]$$

whose lower bound is obtained as

$$\mathcal{L}_{ELBO}(x) = -D_{KL}(q_\phi(z|x)\,\|\,p(z)) + \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)]$$

A modification to the training objective is to include $I_q(x;z) = \mathbb{E}_{q_\phi(z,x)}\big[\log \tfrac{q_\phi(z|x)}{q_\phi(z)}\big]$,
which represents the mutual information between the visible variable $x$ and the latent
variable $z$, weighted by a factor of $\alpha$, and to weight the KL divergence term by a
factor of $\lambda$. Thus, the new objective function is

$$-\lambda D_{KL}(q_\phi(z|x)\,\|\,p(z)) + \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] + \alpha I_q(x;z)$$

Show that the new objective can be simplified as

$$\mathbb{E}_{p_D(x)}\big[\mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)]\big] - (1-\alpha)\,\mathbb{E}_{p_D(x)}\big[D_{KL}(q_\phi(z|x)\,\|\,p(z))\big] - (\alpha+\lambda-1)\,D_{KL}(q_\phi(z)\,\|\,p(z))$$
[25 marks]
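For context (not part of the exam), the requested simplification relies on decomposing the expected KL term into the mutual information $I_q(x;z)$ plus an aggregate-posterior KL. The short NumPy sketch below numerically verifies that decomposition on a small, made-up discrete example; all distributions are arbitrary placeholders.

```python
# Hedged sketch: numerically check
#   E_{p_D(x)}[ D_KL(q(z|x) || p(z)) ] = I_q(x; z) + D_KL(q(z) || p(z))
# on a toy discrete problem (all tables below are made up for illustration).
import numpy as np

rng = np.random.default_rng(0)
n_x, n_z = 4, 3

p_D = rng.random(n_x); p_D /= p_D.sum()                 # data distribution p_D(x)
q_z_given_x = rng.random((n_x, n_z))                    # encoder q(z|x), rows sum to 1
q_z_given_x /= q_z_given_x.sum(axis=1, keepdims=True)
p_z = rng.random(n_z); p_z /= p_z.sum()                 # prior p(z)

q_xz = p_D[:, None] * q_z_given_x                       # joint q(x, z)
q_z = q_xz.sum(axis=0)                                  # aggregate posterior q(z)

kl = lambda a, b: np.sum(a * np.log(a / b))

lhs = np.sum(p_D * np.array([kl(q_z_given_x[i], p_z) for i in range(n_x)]))
mi = np.sum(q_xz * np.log(q_z_given_x / q_z))           # I_q(x; z)
rhs = mi + kl(q_z, p_z)

print(lhs, rhs)   # the two values agree up to floating-point error
```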

2. t-SNE: Let the joint probability of two vectors $(x_i, x_j)$ in the high-dimensional space be
$p_{ij}$, and let the joint probability of their corresponding vectors $(y_i, y_j)$ in the
lower-dimensional space, $q_{ij}$, be

$$q_{ij} = \frac{y_i^\top W y_j}{\sum_{l}\sum_{k \neq l} y_k^\top W y_l}$$

The cost function of the model is the KL divergence between the two joint probability dis-
tributions, $C = KL(P\,\|\,Q)$. Now, what is the gradient of the cost function with respect
to $y_i$? (Let $p_{ii} = q_{ii} = 0$, and let $W$ be symmetric.) [20 marks]
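As an illustration only (not part of the exam), the NumPy sketch below builds the $q_{ij}$ matrix from the bilinear similarity defined above (note this is the question's similarity, not the usual Student-t kernel of t-SNE) and evaluates the cost $C = KL(P\,\|\,Q)$. The embeddings, $W$, and $P$ are arbitrary nonnegative placeholders chosen so the affinities form a valid distribution.

```python
# Hedged sketch: the low-dimensional affinities q_ij and the KL cost from Question 2.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 2
Y = rng.random((n, d))                         # low-dimensional embeddings y_i (nonnegative placeholders)
W = rng.random((d, d)); W = (W + W.T) / 2      # symmetric W, as stated in the question

S = Y @ W @ Y.T                                # S[i, j] = y_i^T W y_j
np.fill_diagonal(S, 0.0)                       # q_ii = 0
Q = S / S.sum()                                # normalize over all pairs k != l

P = rng.random((n, n))                         # placeholder high-dimensional affinities p_ij
np.fill_diagonal(P, 0.0); P /= P.sum()

mask = P > 0
C = np.sum(P[mask] * np.log(P[mask] / Q[mask]))   # C = KL(P || Q)
print(C)
```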
Figure 1: Structure of the LSTM layer with the attention framework for question answering.

3. A question-answering LSTM machine is shown in Figure 1. To find the answers to
questions in a multiple-choice problem, the model uses an LSTM-attention neural
architecture. Each question has four options, each of which is a word, phrase, value, or
sentence. The structure of the LSTM layer with the attention framework is shown in Figure
1. Given $D$-dimensional input sentence sequence embeddings $X = x(1), \ldots, x(n)$ for each
question and its four options, we pass them to the LSTM layer. We obtain $o = o(1), \ldots, o(n)$
as the $H$-dimensional output vector sequence. For each question, we obtain the
output embedding $o_q$ by average pooling these vectors. On the options (answers) side, an
attention-based embedding is generated as follows:

$$m_{a,q}(t) = \tanh(W_{am}\, o_a(t) + W_{qm}\, o_q)$$

$$s_{a,q}(t) = \mathrm{softmax}\big(w_{ms}^\top [m_{a,q}(1), \ldots, m_{a,q}(n)]\big)(t)$$

$$\tilde{o}_a(t) = o_a(t)\, s_{a,q}(t)$$

where $W_{am} \in \mathbb{R}^{H \times H}$, $W_{qm} \in \mathbb{R}^{H \times H}$ and $w_{ms} \in \mathbb{R}^{H}$ are attention weights. Finally, an
average pooling is done over all $\tilde{o}_a(t)$ to generate the answer embedding $o_a$. Let $o_q$, $o_{ap}$
and $o_{an}$ be the network outputs for a question input, its correct answer (ap) and an incorrect
answer (an). We aim to maximize the similarity between $o_q$ and $o_{ap}$ and minimize the similarity
between $o_q$ and $o_{an}$ using the triplet loss defined as

$$L(o_q, o_{ap}, o_{an}) = -o_q^\top o_{ap} + o_q^\top o_{an} + \alpha$$

where $\alpha$ is an arbitrary constant. How will you derive the update equation for the attention
weight $W_{am} \in \mathbb{R}^{H \times H}$?
[20 marks]
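For illustration only (not part of the exam), here is a minimal NumPy sketch of the option-side attention step defined above. The random arrays are placeholders standing in for the LSTM outputs $o_a(t)$ and the pooled question embedding $o_q$, and the weights are arbitrary.

```python
# Hedged sketch: the attention computation of Question 3 in NumPy.
import numpy as np

rng = np.random.default_rng(0)
H, n = 4, 6
o_a = rng.standard_normal((n, H))      # option-side LSTM outputs o_a(t), t = 1..n (placeholders)
o_q = rng.standard_normal(H)           # question embedding after average pooling (placeholder)

W_am = rng.standard_normal((H, H))     # attention weights (placeholders)
W_qm = rng.standard_normal((H, H))
w_ms = rng.standard_normal(H)

m = np.tanh(o_a @ W_am.T + o_q @ W_qm.T)         # m_{a,q}(t) = tanh(W_am o_a(t) + W_qm o_q)
logits = m @ w_ms                                # w_ms^T m_{a,q}(t), one scalar per t
s = np.exp(logits - logits.max()); s /= s.sum()  # softmax over t -> s_{a,q}(t)
o_tilde = o_a * s[:, None]                       # o~_a(t) = o_a(t) s_{a,q}(t)
o_a_emb = o_tilde.mean(axis=0)                   # average pooling -> answer embedding o_a
print(o_a_emb.shape)                             # (H,)
```

The derivation asked for in the question would backpropagate the triplet loss through this computation to obtain the gradient with respect to $W_{am}$.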
4. RNN: Suppose we receive two binary sequences $x_1 = (x_1^{(1)}, \ldots, x_1^{(T)})$ and $x_2 = (x_2^{(1)}, \ldots, x_2^{(T)})$
of equal length, and we would like to design an RNN to determine if they are identical.
We will use the following (rather unusual) architecture, drawn with self-loops on the left
and unrolled on the right:

The computation in each step is as follows:

$$h^{(t)} = \phi(W x^{(t)} + b)$$

$$y^{(t)} = \begin{cases} \phi(v^\top h^{(t)} + r\,y^{(t-1)} + c) & \text{for } t > 1 \\ \phi(v^\top h^{(t)} + c_0) & \text{for } t = 1, \end{cases}$$

where $\phi$ denotes the hard threshold activation function

$$\phi(z) = \begin{cases} 1 & \text{if } z > 0 \\ 0 & \text{if } z \le 0 \end{cases}$$

The parameters are a 2 × 2 weight matrix $W$, a 2-dimensional bias vector $b$, a 2-dimensional
weight vector $v$, a scalar recurrent weight $r$, a scalar bias $c$ for all but the first time
step, and a separate bias $c_0$ for the first time step.
We’ll use the following strategy. We’ll proceed one step at a time, and at time $t$, the
binary-valued elements $x_1^{(t)}$ and $x_2^{(t)}$ will be fed as inputs. The output unit $y^{(t)}$ at time $t$
will compute whether all pairs of elements have matched up to time $t$. The two hidden
units $h_1^{(t)}$ and $h_2^{(t)}$ will help determine if both inputs match at a given time step. Give
parameters which correctly implement this function: $W$, $b$, $v$, $r$, $c$, $c_0$. Hint: we can have
$h_1^{(t)}$ determine if both inputs are 0, and $h_2^{(t)}$ determine if both inputs are 1.
[15 marks]
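As a study aid (not part of the exam, and not the answer), the sketch below runs the recurrence above for a candidate parameter setting and exhaustively checks it on short sequences. The zero-valued parameters are placeholders to be replaced by the values you derive.

```python
# Hedged sketch: a harness for testing a candidate (W, b, v, r, c, c0) for Question 4.
import itertools
import numpy as np

phi = lambda z: (np.asarray(z) > 0).astype(float)   # hard threshold activation

def rnn_equal(x1, x2, W, b, v, r, c, c0):
    """Run the recurrence and return y^(T) for the two binary sequences."""
    y = None
    for t, (x1_t, x2_t) in enumerate(zip(x1, x2)):
        x_t = np.array([x1_t, x2_t], dtype=float)
        h = phi(W @ x_t + b)                        # h^(t) = phi(W x^(t) + b)
        if t == 0:
            y = phi(v @ h + c0)                     # first step uses bias c0
        else:
            y = phi(v @ h + r * y + c)              # later steps feed back y^(t-1)
    return float(y)

# Placeholder parameters -- replace with the values you derive.
W, b, v = np.zeros((2, 2)), np.zeros(2), np.zeros(2)
r, c, c0 = 0.0, 0.0, 0.0

# Exhaustive check on short sequences: output should be 1 iff the sequences match.
T = 4
ok = all(rnn_equal(s1, s2, W, b, v, r, c, c0) == float(s1 == s2)
         for s1 in itertools.product([0, 1], repeat=T)
         for s2 in itertools.product([0, 1], repeat=T))
print("candidate parameters correct on all length-4 sequences:", ok)
```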
5. Consider the Boltzmann machine shown here with three units: two visible and one hidden.
The units take on values 0 or 1, and the weights between the units are $w_{v_1 v_2} = \log_e 3$,
$w_{v_1 h} = \log_e 2$, $w_{v_2 h} = \log_e 2$.

(a) Write down, for each state (i.e., for all the combinations of settings of the units), the
expression for energy, the un-normalized probability and the normalized probability.
(b) Compute the probability that the visible units are in the state $(v_1, v_2) = (1, 1)$ when
the network is generating data freely (i.e., when the visible units are not clamped).
(c) If the network is being trained on a single data point where the visible units are in
the state $(v_1, v_2) = (1, 1)$, what is the derivative of the log probability of the data
with respect to $w_{v_2 h}$?
[20 marks]
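For checking hand calculations (not part of the exam), the sketch below enumerates all $2^3$ states and computes energies, un-normalized probabilities, and the free-running probability of $(v_1, v_2) = (1, 1)$. It assumes the standard Boltzmann machine energy with no bias terms, which is consistent with the weights given above.

```python
# Hedged sketch: enumerate the states of the three-unit Boltzmann machine in Question 5,
# assuming E(v1, v2, h) = -(w_{v1v2} v1 v2 + w_{v1h} v1 h + w_{v2h} v2 h) with no biases.
import itertools
import numpy as np

w12, w1h, w2h = np.log(3), np.log(2), np.log(2)

states, unnorm = [], []
for v1, v2, h in itertools.product([0, 1], repeat=3):
    E = -(w12 * v1 * v2 + w1h * v1 * h + w2h * v2 * h)   # energy of this state
    states.append((v1, v2, h))
    unnorm.append(np.exp(-E))                            # un-normalized probability

Z = sum(unnorm)                                          # partition function
probs = {s: u / Z for s, u in zip(states, unnorm)}       # normalized probabilities

# Part (b): P(v1 = 1, v2 = 1) under free running = sum over h of P(1, 1, h)
p_v11 = probs[(1, 1, 0)] + probs[(1, 1, 1)]
print(p_v11)
```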
