DEEP LEARNING WEEK 10
1. Which of the following architectures has the highest number of layers?
a) AlexNet
b) GoogleNet
c) VGG
d) ResNet
Answer: d)
Solution: ResNet is the deepest of the four. Its standard variants go up to 152 layers, compared
with 8 for AlexNet, 16 to 19 for VGG, and 22 for GoogleNet.
2. Consider a convolution operation with an input image of size 100x100x3 and a filter of size
8x8x3, using a stride of 1 and a padding of 1. What is the output size?
A. 100x100x3
B. 98x98x1
C. 102x102x3
D. 95x95x1
Answer: D
Solution: Output size = (Input size - Filter size + 2 x Padding)/Stride + 1. Here, along each
spatial dimension, Input size = 100, Filter size = 8, Padding = 1, and Stride = 1, so
Output size = (100 - 8 + 2)/1 + 1 = 95. The filter spans all 3 input channels, and a single
filter produces a single output channel. Therefore, the output size is 95x95x1. Hence, the
correct answer is option D.
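The formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `conv_output_size` is hypothetical, and floor division is assumed for cases where the stride does not divide evenly.

```python
def conv_output_size(input_size, filter_size, padding, stride):
    """Spatial output size of a convolution along one dimension."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# Question 2: 100x100x3 input, one 8x8x3 filter, stride 1, padding 1.
side = conv_output_size(100, 8, padding=1, stride=1)
print(f"{side}x{side}x1")  # prints 95x95x1
```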
3. Consider a convolution operation with an input image of size 256x256x3 and 40 filters of size
11x11x3, using a stride of 4 and a padding of 2. What is the height of the output?
A. 63
B. 64
C. 40
D. 3
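Answer: A
Solution: Output height = floor((Input size - Filter size + 2 x Padding)/Stride) + 1
= floor((256 - 11 + 4)/4) + 1 = 62 + 1 = 63. The number of filters (40) sets the depth of the
output, not its height, so the full output size is 63x63x40.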
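One way to see where the output height comes from is to count the positions the filter can occupy along one axis of the padded input. The sketch below does that for the sizes in this question; the helper name `window_starts` is hypothetical. The count it gives is independent of the number of filters, which only sets the depth of the output.

```python
def window_starts(input_size, filter_size, padding, stride):
    """Start indices of every valid filter placement along one padded axis."""
    padded = input_size + 2 * padding
    last_start = padded - filter_size
    return list(range(0, last_start + 1, stride))


starts = window_starts(256, 11, padding=2, stride=4)
num_filters = 40
output_shape = (len(starts), len(starts), num_filters)  # (height, width, depth)
print(starts[:3], starts[-1], output_shape)
```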
4. Which statement is true about the number of filters in CNNs?
a) More filters lead to better accuracy.
b) Fewer filters lead to better accuracy.
c) The number of filters has no effect on accuracy.
d) The number of filters only affects the computation time.
Answer: a) More filters lead to better accuracy.
Solution: More filters can lead to better accuracy because they allow the network to learn
more complex and diverse features. However, increasing the number of filters also increases
the number of parameters in the network.
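The growth in parameters mentioned above is linear in the number of filters. A minimal sketch, assuming each filter carries one bias term; the function name `conv_params` and the layer sizes are illustrative only.

```python
def conv_params(filter_h, filter_w, in_channels, num_filters):
    """Learnable parameters in one conv layer: weights plus one bias per filter."""
    return (filter_h * filter_w * in_channels + 1) * num_filters


# 3x3 filters over a 3-channel input, with a growing number of filters.
for n in (16, 32, 64):
    print(n, conv_params(3, 3, 3, n))
```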
5. Which of the following statements is true regarding the occlusion experiment in a CNN?
A. It is used to determine the importance of each feature map in the output of the network.
B. It involves masking a portion of the input image with a patch of zeroes.
C. It is a technique used to prevent overfitting in deep learning models.
D. It is used to increase the number of filters in a convolutional layer.
Answer: A, B
Solution: In the occlusion experiment, a patch of zeroes is placed over a portion of the
input image to observe the effect on the output of the network. This helps to determine the
importance of each region of the image in the network’s prediction.
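The procedure can be sketched in plain Python. This is an illustration under stated assumptions, not a reference implementation: `occlusion_map` and `toy_score` are hypothetical names, and `toy_score` stands in for a real network's class score by only looking at the top-left corner of the image.

```python
def occlusion_map(image, score_fn, patch=2):
    """Drop in score when each patch x patch block of the image is zeroed out."""
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    drops = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            occluded = [r[:] for r in image]  # copy so the original is untouched
            for i in range(top, min(top + patch, height)):
                for j in range(left, min(left + patch, width)):
                    occluded[i][j] = 0.0
            row.append(baseline - score_fn(occluded))
        drops.append(row)
    return drops


# Toy stand-in for a class score: it depends only on the top-left 2x2 block.
def toy_score(image):
    return sum(image[i][j] for i in range(2) for j in range(2))


image = [[1.0] * 4 for _ in range(4)]
heatmap = occlusion_map(image, toy_score)
print(heatmap)
```

A large drop marks a region the prediction depends on; here only the top-left block produces a non-zero drop.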
6. Which of the following is an innovation introduced in GoogleNet architecture?
a) 1x1 convolutions to reduce the dimension
b) ReLU activation function
c) Dropout regularization
d) Use of different-sized filters for the same input
Answer: a), d)
Solution: GoogleNet introduced the inception module, which uses 1x1 convolutions to reduce the
channel dimension of its input and applies different-sized filters in parallel to the same
input. The resulting feature maps are concatenated before being passed to the next layer.
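The saving from a 1x1 reduction can be shown with simple arithmetic. The sketch below compares the weight count of a 5x5 convolution applied directly with the same convolution preceded by a 1x1 reduction; the channel counts and the name `conv_weights` are illustrative assumptions, and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weight count of a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


in_ch, reduced_ch, out_ch = 192, 16, 32  # illustrative channel counts

direct = conv_weights(5, in_ch, out_ch)
with_reduction = conv_weights(1, in_ch, reduced_ch) + conv_weights(5, reduced_ch, out_ch)
print(direct, with_reduction)  # prints 153600 15872
```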
7. What is the purpose of guided backpropagation in CNNs?
a) To visualize which pixels in an image are most important for a particular class prediction.
b) To train the CNN to improve its accuracy on a given task.
c) To reduce the size of the input images in order to speed up computation.
d) None of the above.
Answer: a)
Explanation: Guided backpropagation is a technique used to visualize the parts of an input
image that are most important for a particular class prediction. It backpropagates the score
of the output class to the input image, but at each ReLU it passes a gradient only if both the
gradient and the corresponding forward activation are positive.
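The difference from ordinary backpropagation shows up only in the backward rule for ReLU. A minimal sketch on plain Python lists; the function names and the sample values are hypothetical.

```python
def relu_backward(forward_input, grad_output):
    """Ordinary backprop through ReLU: pass gradient where the input was positive."""
    return [g if x > 0 else 0.0 for x, g in zip(forward_input, grad_output)]


def guided_relu_backward(forward_input, grad_output):
    """Guided backprop: additionally drop negative incoming gradients."""
    return [g if (x > 0 and g > 0) else 0.0 for x, g in zip(forward_input, grad_output)]


forward_input = [1.5, -0.5, 2.0, 0.3]
grad_output = [0.7, 0.9, -0.4, 0.2]
print(relu_backward(forward_input, grad_output))         # [0.7, 0.0, -0.4, 0.2]
print(guided_relu_backward(forward_input, grad_output))  # [0.7, 0.0, 0.0, 0.2]
```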
8. Which layer in a CNN is used for guided backpropagation?
a) Input layer
b) Convolutional layer
c) Activation layer
d) Pooling layer
Answer: c)
Explanation: Guided backpropagation is applied at the activation layers of a CNN, since the
method changes how gradients pass backward through the ReLU activations. Gradients flow
through the other layers as in ordinary backpropagation.
9. Which of the following is a technique used to fool CNNs in Deep Learning?
a) Adversarial examples
b) Transfer learning
c) Dropout
d) Batch normalization
Answer: a) Adversarial examples
Solution: Adversarial examples are images that have been specifically designed to trick a
CNN into misclassifying them. They are created by making small, imperceptible changes to
an image that cause the CNN to output the wrong classification.
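One well-known way to construct such perturbations is the fast gradient sign method (FGSM), which is not named in the question and is used here only as an illustration. The sketch applies it to a toy logistic model rather than a CNN; the weights, input, and function names are all hypothetical.

```python
import math


def logit(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def predict(weights, x):
    """Class 1 if the logit is positive, else class 0."""
    return 1 if logit(weights, x) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, x, label, eps):
    """Shift every input by +/- eps in the direction that increases the loss."""
    prob = 1.0 / (1.0 + math.exp(-logit(weights, x)))
    # Gradient of the cross-entropy loss w.r.t. x for a logistic model.
    grad = [(prob - label) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


weights = [1.0, -2.0, 0.5]  # toy "trained" model
x = [0.3, 0.1, 0.2]         # classified as class 1
x_adv = fgsm(weights, x, label=1, eps=0.1)
print(predict(weights, x), predict(weights, x_adv))  # prints 1 0
```

No input value changes by more than `eps`, yet the predicted class flips.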
10. We have a trained CNN. When the picture on the left is fed into the network as input, it is
given the label ‘HUMAN’ with high probability. The picture on the right is the same image with
some added noise. If we feed the right image as input to the CNN, which of the following
statements is true?
[Figure: Left Image, Right Image]
a) CNN will detect the image as ‘HUMAN’.
b) CNN will not detect the image as ‘HUMAN’ since noise is added to the image.
c) CNN will detect the image as ‘HUMAN’ but with a lower probability than the left image.
d) Insufficient information to say anything.
Answer: d)
Solution: The CNN may classify this image as ‘HUMAN’ or ‘NOT HUMAN’ depending on the decision
boundary it has learned. We cannot say what will happen: the added noise may push the image
across the decision boundary, or it may push the image further inside the ‘HUMAN’ region, which
would increase the probability score the CNN gives to the image.
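The point can be illustrated with a toy linear score standing in for the CNN. In this sketch the weights, the input, and both noise vectors are made-up values: the two noise vectors have the same magnitude but move the score in opposite directions.

```python
def score(weights, x):
    """Toy stand-in for the network's 'HUMAN' score (a linear logit)."""
    return sum(w * xi for w, xi in zip(weights, x))


weights = [1.0, -2.0, 0.5]
image = [0.3, 0.1, 0.2]

noise_a = [0.1, -0.1, 0.1]   # happens to move the input further inside the class
noise_b = [-0.1, 0.1, -0.1]  # same magnitude, opposite direction

base = score(weights, image)
score_a = score(weights, [p + n for p, n in zip(image, noise_a)])
score_b = score(weights, [p + n for p, n in zip(image, noise_b)])
print(base, score_a, score_b)
```

With `noise_a` the score rises above the original, while with `noise_b` it becomes negative and the label changes, so the magnitude of the noise alone does not determine the outcome.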