CNN MCQs

The document consists of multiple-choice questions related to convolutional neural networks, covering topics such as convolution layers, downsampling, upsampling, and various architectures like AlexNet and VGG. Key concepts include the functions of convolution layers, the purpose of downsampling and upsampling, and the characteristics of different neural network architectures. The questions assess knowledge on the advantages of techniques like dilated convolutions, the impact of padding, and the role of data augmentation in training models.

Uploaded by

abdullahtest1111

Multiple Choice Questions (MCQs) Based on Lecture 15-18 Slides

7.1 Convolution Layers

What is the primary function of a convolution layer in a neural network?


a) To perform matrix multiplication
b) To apply filters to input data to extract features
c) To normalize the output data
d) To reduce the number of parameters
Answer: b

In a convolutional layer, how are weights shared?


a) Across all layers
b) Within the same filter across the input
c) Only in the output channel
d) Not shared at all
Answer: b
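
To make weight sharing concrete, here is a minimal sketch in plain Python (not taken from the lecture slides). One small filter is slid over a single-channel image, so the same weights are reused at every position. The name conv2d_valid and the toy values are illustrative assumptions. As in most deep learning libraries, the kernel is applied without flipping (cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # The same kernel weights are reused at every (i, j): weight sharing.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
diagonal_filter = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, diagonal_filter)  # 3x3 input -> 2x2 output
```

The filter has only four weights regardless of how large the image is, which is the parameter saving the question refers to.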

7.2 Downsampling

What is the main purpose of downsampling in convolutional neural networks?


a) To increase the number of features
b) To reduce spatial dimensions while retaining important information
c) To add more layers to the network
d) To increase the receptive field without filters
Answer: b

Which technique is commonly used for downsampling?


a) Nearest neighbor interpolation
b) Pooling
c) Upsampling
d) Fully connected layers
Answer: b
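
As an illustration of pooling as a downsampling step, the sketch below implements non-overlapping max pooling in plain Python. The function name max_pool2d and the 4x4 toy feature map are assumptions made for this example, not material from the slides.

```python
def max_pool2d(x, size=2):
    """Non-overlapping max pooling (stride equal to the window size)."""
    out = []
    for i in range(0, len(x) - size + 1, size):
        row = []
        for j in range(0, len(x[0]) - size + 1, size):
            # Collect the size x size window and keep only its largest value.
            window = [x[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append(max(window))
        out.append(row)
    return out


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(feature_map)  # 4x4 -> 2x2
```

Each output value summarizes a 2x2 region, so the spatial size is halved in both directions while the strongest activation in each region is kept.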

7.3 Upsampling

What is the purpose of upsampling in neural networks?


a) To reduce the image size
b) To increase the spatial resolution of the feature maps
c) To apply convolution filters
d) To perform classification
Answer: b

Which method is mentioned for scaling channels during upsampling?
a) Bilinear interpolation
b) Nearest neighbor interpolation
c) Gaussian blur
d) Median filtering
Answer: b
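
Nearest neighbor interpolation can be sketched in a few lines of plain Python. The helper name upsample_nearest and the 2x2 input are assumptions for illustration only.

```python
def upsample_nearest(x, factor=2):
    """Repeat every value `factor` times along both spatial axes."""
    out = []
    for row in x:
        # Repeat each value horizontally...
        wide = [v for v in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        for _ in range(factor):
            out.append(list(wide))
    return out


small = [
    [1, 2],
    [3, 4],
]
large = upsample_nearest(small)  # 2x2 -> 4x4
```

Because values are simply copied, the operation has no learnable parameters, and the result is blocky rather than smooth.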

7.4 Architectures

How many layers does AlexNet, introduced in 2012, consist of?


a) 5 layers
b) 8 layers
c) 16 layers
d) 152 layers
Answer: b

What technique was used in AlexNet to prevent overfitting?


a) Batch normalization
b) Dropout
c) Weight decay
d) Data augmentation
Answer: b

How many parameters does Inception (GoogLeNet) have compared to VGG-16?


a) 2x more
b) 2x less
c) Equal
d) 10x more
Answer: b

What is a key feature of Inception modules?


a) Use of only 3x3 filters
b) Utilization of conv/pool operations with varying filter sizes
c) Elimination of pooling layers
d) Use of fully connected layers only
Answer: b

Fully Connected vs. Convolutional Layers

What is the formula for the number of weights in a fully connected layer?
a) W × H × C_out × (W × H × C_in + 1)
b) C_out × (K × K × C_in + 1)
c) W × H × C_in
d) K × K × C_out
Answer: a

How does the number of weights in a convolutional layer compare to a fully connected layer?
a) Always higher
b) Generally lower due to weight sharing
c) Equal
d) Depends only on input size
Answer: b
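
The two weight counts can be compared numerically. The sketch below uses the fully connected formula from option (a) above and assumes that option (b), C_out × (K × K × C_in + 1), is the corresponding count for a convolutional layer with one bias per filter. The function names and the 32 × 32 × 3 example are illustrative assumptions.

```python
def fc_weights(w, h, c_in, c_out):
    """Fully connected map from a W x H x C_in input to a W x H x C_out output, with biases."""
    return w * h * c_out * (w * h * c_in + 1)


def conv_weights(k, c_in, c_out):
    """Convolution with C_out filters of size K x K x C_in, one bias per filter."""
    return c_out * (k * k * c_in + 1)


# Hypothetical layer: 32 x 32 RGB input, 16 output channels, 3 x 3 kernels.
fc = fc_weights(32, 32, 3, 16)
conv = conv_weights(3, 3, 16)
ratio = fc // conv
```

For this example the fully connected layer needs 50,348,032 weights and the convolutional layer 448, because the convolution count does not depend on W or H.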

What does C_in represent in the context of convolutional layers?


a) Output channels
b) Input channels
c) Kernel size
d) Layer width
Answer: b

Receptive Field and Arithmetics

What is the receptive field in a convolutional neural network?


a) The size of the input image
b) The region of the input that affects a particular output
c) The number of layers
d) The kernel size only
Answer: b

How can the receptive field be increased without pooling?


a) Using dilated convolutions
b) Increasing the number of filters
c) Reducing kernel size
d) Adding fully connected layers
Answer: a
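
The receptive field of a stack of layers can be computed with a standard recurrence. The sketch below is a plain-Python illustration along one spatial axis. The (kernel, stride, dilation) tuple layout and the example stacks are assumptions, not taken from the slides.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride, dilation) tuples, first layer first."""
    rf = 1    # before any layer, one unit sees exactly one input pixel
    jump = 1  # distance in input pixels between neighbouring units at this depth
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


plain = receptive_field([(3, 1, 1)] * 3)                      # three 3x3 convolutions
pooled = receptive_field([(3, 1, 1), (2, 2, 1), (3, 1, 1)])   # conv, 2x2 pooling, conv
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])  # dilations 1, 2, 4
```

Three plain 3x3 layers see 7 input pixels, while the same three layers with dilations 1, 2 and 4 see 15, with no pooling and no extra weights.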

Padding

What is the role of padding in convolution operations?


a) To increase the number of output channels
b) To preserve the spatial dimensions of the input
c) To reduce computational cost
d) To eliminate features
Answer: b
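
The effect of padding on output size follows from the usual output-size formula for a convolution along one axis. The sketch below assumes symmetric padding on both sides. The function name and the example sizes are illustrative.

```python
def conv_output_size(n, kernel, padding=0, stride=1, dilation=1):
    """Output length along one axis for input length n."""
    span = dilation * (kernel - 1) + 1  # input extent covered by one kernel placement
    return (n + 2 * padding - span) // stride + 1


no_pad = conv_output_size(32, 5)                        # output shrinks
same_pad = conv_output_size(32, 5, padding=2)           # output matches the input
strided = conv_output_size(32, 3, padding=1, stride=2)  # stride 2 halves the size
```

With a 5x5 kernel, a padding of 2 on each side is what keeps a 32-pixel axis at 32. Without padding it drops to 28.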

Implementation as Computation Graph

What does a computation graph represent in the context of convolution?
a) The hardware architecture
b) The flow of data and operations
c) The loss function
d) The training dataset
Answer: b

Dilated Convolutions

What is a key advantage of dilated convolutions?


a) Reduces the receptive field
b) Increases the receptive field without upsampling/downsampling
c) Eliminates the need for convolution
d) Decreases the number of parameters
Answer: b

What is the dilation factor mentioned in the context of dilated convolutions?


a) d=1
b) d=2
c) d=5
d) d=10
Answer: b

In which application are dilated convolutions particularly useful?


a) Image classification
b) Semantic segmentation
c) Noise reduction
d) Data augmentation
Answer: b

Are dilated convolutions considered upsampling operations?


a) Yes
b) No
c) Only with pooling
d) Only with large kernels
Answer: b
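
A one-dimensional sketch shows what the dilation factor does: the kernel keeps the same number of weights, but its taps are spaced d positions apart. The function name dilated_conv1d and the toy signal are assumptions for illustration.

```python
def dilated_conv1d(x, w, dilation=1):
    """1D convolution (no flipping, no padding) with kernel taps `dilation` apart."""
    span = dilation * (len(w) - 1) + 1  # input extent covered by the kernel
    out = []
    for i in range(len(x) - span + 1):
        out.append(sum(w[k] * x[i + k * dilation] for k in range(len(w))))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
weights = [1, 1, 1]
dense = dilated_conv1d(signal, weights, dilation=1)   # taps at i, i+1, i+2
spread = dilated_conv1d(signal, weights, dilation=2)  # taps at i, i+2, i+4
```

With d=2 the three weights cover five input positions instead of three, which is how the receptive field grows without pooling, upsampling or additional parameters.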

Visualization

What is the purpose of visualization in convolutional networks?


a) To train the model
b) To understand feature extraction and network behavior
c) To increase the dataset size
d) To reduce memory usage
Answer: b

VGG Architecture

What is a characteristic of fully connected layers in the VGG architecture?


a) They are the least memory-intensive
b) They are the most memory-intensive
c) They are not used
d) They reduce the receptive field
Answer: b

How many layers does the deepest architecture mentioned (ResNet) have?
a) 50
b) 152
c) 16
d) 8
Answer: b

SegNet

Which upsampling approach is used in SegNet?


a) Bilinear interpolation
b) Nearest neighbor interpolation
c) Gaussian upsampling
d) Cubic interpolation
Answer: b

General Concepts

What year was AlexNet introduced?


a) 2010
b) 2012
c) 2015
d) 2018
Answer: b

Which GPU was used to train AlexNet?


a) GTX 1080
b) GTX 580
c) RTX 2080
d) Tesla V100
Answer: b

What activation function was used in AlexNet?
a) Sigmoid
b) ReLU
c) Tanh
d) Softmax
Answer: b

What technique was used in AlexNet to enhance training data?


a) Regularization
b) Data augmentation
c) Batch normalization
d) Gradient clipping
Answer: b

How does the spatial resolution change with depth in AlexNet?


a) Increases
b) Decreases
c) Remains constant
d) Depends on the input
Answer: b

What is a key efficiency feature of Inception modules?


a) 1x1 convolutions
b) 7x7 filters only
c) No pooling
d) Fully connected layers
Answer: a

In which year was Inception (GoogLeNet) introduced?


a) 2012
b) 2014
c) 2015
d) 2016
Answer: c

What is a benefit of using intermediate classification pooling in Inception?


a) Reduces memory usage
b) Eliminates the need for fully connected layers
c) Increases the number of parameters
d) Slows down training
Answer: b

What is the total number of parameters in Inception (GoogLeNet)?


a) 5 million
b) 10 million
c) 138 million
d) 200 million
Answer: a

Which paper introduced the concept of dilated convolutions?


a) AlexNet
b) Multi-scale Context Aggregation by Dilated Convolutions
c) VGG
d) SegNet
Answer: b

What does a dilation factor of 2 do in dilated convolutions?


a) Reduces the kernel size
b) Increases the receptive field with distributed sampling
c) Eliminates the need for padding
d) Decreases the output resolution
Answer: b

Which technique does not require pooling to increase the receptive field?
a) Standard convolution
b) Dilated convolution
c) Upsampling
d) Downsampling
Answer: b

What is a common application of convolutional encoder-decoder architectures?


a) Text classification
b) Image segmentation
c) Speech recognition
d) Time series analysis
Answer: b

How many convolution layers are mentioned in a simple architecture with 5x5 filters?
a) 1
b) 2
c) 3
d) 4
Answer: b

What size pooling layers are mentioned alongside 5x5 convolution layers?
a) 3x3
b) 2x2
c) 4x4
d) 1x1
Answer: b

What is the output of a convolution operation as shown in the diagram?


a) 2x2 grid with values 0, 5, 0, -1
b) 3x3 grid
c) 1x1 grid
d) 4x4 grid
Answer: a

What is equivariance in the context of convolutional networks?


a) The network's ability to handle different input sizes
b) The property that the output transforms in the same way as the input under certain transformations
c) The reduction of parameters
d) The increase in spatial resolution
Answer: b
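
Translation equivariance can be checked directly: shifting the input and then convolving gives the same result as convolving and then shifting. The sketch below uses a circular (wrap-around) 1D convolution so that the equality is exact at the borders. That choice, the helper names and the toy signal are assumptions made for this illustration.

```python
def circular_conv1d(x, w):
    """1D convolution (no flipping) with wrap-around indexing."""
    n = len(x)
    return [sum(w[k] * x[(i + k) % n] for k in range(len(w))) for i in range(n)]


def shift(x, s):
    """Cyclically shift a list to the right by s positions."""
    s %= len(x)
    return x[-s:] + x[:-s] if s else list(x)


signal = [0, 1, 4, 2, 0, 0]
kernel = [1, -1]
shift_then_conv = circular_conv1d(shift(signal, 2), kernel)
conv_then_shift = shift(circular_conv1d(signal, kernel), 2)
```

Both orders produce the same list, which is the property described in option (b). With zero padding instead of wrap-around, the equality holds away from the borders only.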

Which architecture is known for very deep convolutional networks?


a) AlexNet
b) VGG
c) Inception
d) SegNet
Answer: b

What does the VGG paper by Simonyan and Zisserman discuss?


a) Large-scale image recognition
b) Text processing
c) Audio classification
d) Reinforcement learning
Answer: a

How many fully connected layers are typically used in the architectures mentioned?
a) 0
b) 1
c) 2
d) 3
Answer: c

What is a disadvantage of fully connected layers in deep networks?


a) High memory usage
b) Low computational cost
c) Small receptive field
d) No feature extraction
Answer: a

Which layer type is three-dimensional in nature?
a) Fully connected layer
b) Convolution kernel
c) Pooling layer
d) Dropout layer
Answer: b

What happens to the number of feature channels in AlexNet with depth?


a) Decreases
b) Increases
c) Remains constant
d) Randomly varies
Answer: b

Which technique was not mentioned as part of AlexNet's training strategy?


a) ReLUs
b) Dropout
c) Batch normalization
d) Data augmentation
Answer: c

What is the kernel size used in the Inception module for efficiency?
a) 3x3
b) 1x1
c) 7x7
d) 5x5
Answer: b

How many GPUs were used to train AlexNet?


a) 1
b) 2
c) 4
d) 8
Answer: b

What is the primary goal of using varying filter sizes in Inception modules?
a) To reduce the network depth
b) To capture multi-scale features
c) To increase memory usage
d) To simplify the architecture
Answer: b

Which year saw the introduction of very deep convolutional networks like VGG?
a) 2012
b) 2014
c) 2015
d) 2016
Answer: c

What is a key difference between FCN-8s and DeepLab as shown in the comparison?
a) FCN-8s uses dilated convolutions
b) DeepLab produces more accurate results
c) FCN-8s has no pooling
d) DeepLab uses only fully connected layers
Answer: b

What is the ground truth in the context of the image processing technique comparison?
a) The input image
b) The expected output
c) The filter kernel
d) The training data
Answer: b

Which architecture eliminates the need for fully connected layers with intermediate pooling?
a) AlexNet
b) VGG
c) Inception
d) SegNet
Answer: c

What is the effect of padding on the output size of a convolution?


a) Increases it
b) Decreases it
c) Keeps it the same as input
d) Randomly varies it
Answer: c

Which operation combines input data with a kernel in a neural network?


a) Pooling
b) Convolution
c) Upsampling
d) Normalization
Answer: b

What is a common output resolution goal of dilated convolutions?


a) Lower than input
b) Same as input
c) Double the input
d) Half the input
Answer: b

Which technique is used to scale channels without learning parameters?


a) Convolutional layers
b) Nearest neighbor interpolation
c) Fully connected layers
d) Dilated convolutions
Answer: b

What is the main challenge with fully connected layers in deep networks?
a) Overfitting
b) High computational complexity
c) Lack of feature extraction
d) Small memory footprint
Answer: b

Which layer type is most affected by the increase in depth in VGG?


a) Convolutional layers
b) Pooling layers
c) Fully connected layers
d) Dropout layers
Answer: c

What does the diagram of dilated convolution illustrate?


a) Pooling operation
b) Distributed sampling with a dilation factor
c) Upsampling process
d) Fully connected transformation
Answer: b

Which architecture triggered the deep learning revolution?


a) VGG
b) AlexNet
c) Inception
d) SegNet
Answer: b

What is the purpose of data augmentation in AlexNet?


a) To reduce training time
b) To increase the variety of training data
c) To decrease model complexity
d) To eliminate dropout
Answer: b

How does the Inception architecture reduce parameters?
a) By using larger filters
b) By employing 1x1 convolutions
c) By removing pooling layers
d) By increasing depth
Answer: b

What is a disadvantage of using nearest neighbor interpolation?


a) High computational cost
b) Lack of smoothness in results
c) Requires large memory
d) Incompatible with CNNs
Answer: b

Which paper by Yu and Koltun is associated with dilated convolutions?


a) Very Deep Convolutional Networks
b) Multi-scale Context Aggregation by Dilated Convolutions
c) Large Scale Image Recognition
d) SegNet Architecture
Answer: b

What is the role of ReLUs in AlexNet?


a) To normalize the input
b) To introduce non-linearity
c) To reduce the receptive field
d) To perform pooling
Answer: b

How many pooling layers are typically used with 5x5 convolution layers in the mentioned architecture?
a) 1
b) 2
c) 3
d) 4
Answer: b

What is the effect of increasing the dilation factor in convolutions?


a) Decreases the receptive field
b) Increases the receptive field
c) Keeps the receptive field constant
d) Eliminates the need for kernels
Answer: b

Which layer type is responsible for the most memory usage in VGG?
a) Convolutional layers
b) Pooling layers
c) Fully connected layers
d) Dropout layers
Answer: c

What is a key feature of the SegNet architecture?


a) Use of fully connected layers
b) Encoder-decoder structure with upsampling
c) Elimination of convolution
d) Fixed receptive field
Answer: b

Which technique allows predictions at the same resolution as inputs without pooling?
a) Standard convolution
b) Dilated convolution
c) Downsampling
d) Upsampling
Answer: b

What is the primary focus of the VGG architecture?


a) Shallow networks
b) Very deep convolutional networks
c) Recurrent networks
d) Generative models
Answer: b

How does weight sharing benefit convolutional layers?


a) Increases memory usage
b) Reduces the number of parameters
c) Eliminates the need for training
d) Increases computational complexity
Answer: b

What is the output channel representation in the convolution kernel diagram?


a) Shown for all channels
b) Shown for one channel for clarity
c) Not shown
d) Shown only for input
Answer: b

Which year was the ResNet architecture with 152 layers discussed?
a) 2012
b) 2014
c) 2015
d) 2016
Answer: c

What is a common application of the encoder-decoder architecture?


a) Time series forecasting
b) Image segmentation
c) Text generation
d) Audio synthesis
Answer: b

What does the comparison of FCN-8s, DeepLab, and ground truth illustrate?
a) Training speed
b) Accuracy of segmentation results
c) Memory usage
d) Kernel sizes
Answer: b

Which layer type is used to capture multi-scale context in Inception?


a) Fully connected layers
b) Inception modules
c) Pooling layers
d) Dropout layers
Answer: b

What is the effect of dropout in AlexNet?


a) Increases the number of parameters
b) Prevents overfitting
c) Reduces the receptive field
d) Eliminates convolution
Answer: b

How many convolution layers are in the simple architecture with 2x2 pooling?
a) 1
b) 2
c) 3
d) 4
Answer: b

What is the kernel size of the filters in the simple architecture mentioned?
a) 3x3
b) 5x5
c) 7x7
d) 1x1
Answer: b

Which technique is used to handle large receptive fields efficiently?
a) Fully connected layers
b) Dilated convolutions
c) Nearest neighbor interpolation
d) Downsampling
Answer: b

What is the role of the computation graph in convolution?


a) To visualize the dataset
b) To represent data flow and operations
c) To train the network
d) To store weights
Answer: b

Which architecture has a 152-layer variant, as mentioned?


a) AlexNet
b) ResNet
c) Inception
d) SegNet
Answer: b

What is a key difference between convolutional and fully connected layers?


a) Convolutional layers use weight sharing
b) Fully connected layers use pooling
c) Convolutional layers have no parameters
d) Fully connected layers reduce spatial dimensions
Answer: a

What is the purpose of the ground truth in image processing comparisons?


a) To train the model
b) To provide the expected output for evaluation
c) To define the input
d) To adjust the kernel
Answer: b

Which technique is not part of the upsampling process in SegNet?


a) Nearest neighbor interpolation
b) Bilinear interpolation
c) Dilated convolution
d) Encoder-decoder structure
Answer: c

What is the effect of increasing feature channels in AlexNet?


a) Decreases depth
b) Enhances feature extraction
c) Reduces memory usage
d) Eliminates pooling
Answer: b

Which layer type is responsible for reducing spatial resolution in AlexNet?


a) Convolutional layers
b) Pooling layers
c) Fully connected layers
d) Dropout layers
Answer: b

What is a benefit of using 1x1 convolutions in Inception?


a) Increases the number of parameters
b) Improves efficiency
c) Reduces the receptive field
d) Eliminates depth
Answer: b

Which architecture is known for large-scale image recognition?


a) AlexNet
b) VGG
c) Inception
d) All of the above
Answer: d

What is the primary focus of the DeepLab architecture?


a) Image classification
b) Semantic segmentation
c) Audio processing
d) Text analysis
Answer: b

How does padding affect the convolution process?


a) Reduces the input size
b) Maintains the input size in the output
c) Increases the number of channels
d) Eliminates the kernel
Answer: b

What is the role of the dilation factor in convolutional networks?


a) To reduce the number of layers
b) To control the spacing of kernel elements
c) To increase the input size
d) To eliminate pooling
Answer: b

Which technique is used to visualize features in convolutional networks?


a) Data augmentation
b) Feature map visualization
c) Dropout
d) Batch normalization
Answer: b

What is a common output of the convolution operation as per the diagram?


a) 1x1 grid
b) 2x2 grid with specific values
c) 3x3 grid
d) 4x4 grid
Answer: b

Which architecture is associated with the use of ReLUs and dropout?


a) VGG
b) AlexNet
c) Inception
d) SegNet
Answer: b

Convolutional Neural Networks (CNNs) MCQs with Solutions


Convolutional Neural Networks (CNNs)

What is the primary advantage of CNNs over traditional neural networks for image data?
a) They require less data
b) They exploit spatial structure
c) They eliminate the need for activation functions
d) They are faster to train
Answer: b) They exploit spatial structure
Explanation: CNNs use convolutional layers to capture spatial hierarchies in images.

Which type of data are CNNs primarily designed to handle?
a) Tabular data
b) Sequential data
c) Grid-like data (e.g., images)
d) Unstructured text
Answer: c) Grid-like data (e.g., images)
Explanation: CNNs are optimized for 2D grid data like images.

What is a key feature that reduces the number of parameters in CNNs?
a) Fully connected layers
b) Weight sharing
c) Large filter sizes
d) Increased depth
Answer: b) Weight sharing
Explanation: Weight sharing in convolutional layers reduces parameter count.

Which layer in a CNN applies filters to the input?
a) Pooling layer
b) Convolutional layer
c) Fully connected layer
d) Output layer
Answer: b) Convolutional layer
Explanation: The convolutional layer applies filters to extract features.

What is the purpose of CNNs in computer vision tasks?
a) To perform clustering
b) To detect and classify objects
c) To normalize data
d) To reduce dimensionality
Answer: b) To detect and classify objects
Explanation: CNNs are widely used for object detection and classification.

Convolution

What does a convolution operation in a CNN do?
a) Applies a global transformation
b) Extracts local features using filters
c) Normalizes the entire input
d) Reduces the number of layers
Answer: b) Extracts local features using filters
Explanation: Convolution uses filters to detect local patterns.

What is the role of the filter (kernel) in a convolutional layer?
a) To initialize weights
b) To slide over the input and compute feature maps
c) To reduce spatial dimensions
d) To compute the loss function
Answer: b) To slide over the input and compute feature maps
Explanation: Filters generate feature maps by convolving with the input.

What happens to the output size if no padding is used in convolution?
a) It increases
b) It decreases
c) It remains the same
d) It depends on the filter size
Answer: b) It decreases
Explanation: Without padding, the output size shrinks due to boundary effects.

What is the purpose of the stride parameter in convolution?
a) To determine the filter size
b) To control the step size of the filter
c) To set the learning rate
d) To initialize weights
Answer: b) To control the step size of the filter
Explanation: Stride determines how far the filter moves across the input.

What effect does adding padding have on the convolutional output?
a) Reduces the output size
b) Preserves or increases the output size
c) Eliminates the need for filters
d) Normalizes the input
Answer: b) Preserves or increases the output size
Explanation: Padding maintains or adjusts the output dimensions.

Downsampling

What is the primary purpose of downsampling in CNNs?
a) To increase the number of parameters
b) To reduce spatial dimensions
c) To add noise to the data
d) To increase the learning rate
Answer: b) To reduce spatial dimensions
Explanation: Downsampling reduces computational load and overfitting.

Which technique is commonly used for downsampling in CNNs?
a) Max pooling
b) Dropout
c) Batch normalization
d) Gradient clipping
Answer: a) Max pooling
Explanation: Max pooling selects the maximum value to downsample.

What is the effect of downsampling on the spatial resolution of feature maps?
a) Increases resolution
b) Decreases resolution
c) Maintains resolution
d) Eliminates resolution
Answer: b) Decreases resolution
Explanation: Downsampling reduces the spatial size of feature maps.

What is a disadvantage of aggressive downsampling in CNNs?
a) Increased training speed
b) Loss of fine details
c) Reduced memory usage
d) Improved accuracy
Answer: b) Loss of fine details
Explanation: Excessive downsampling can discard important features.

Which pooling method takes the average value in a region?
a) Max pooling
b) Average pooling
c) Global pooling
d) Sum pooling
Answer: b) Average pooling
Explanation: Average pooling computes the mean value in a region.

Upsampling

What is the primary purpose of upsampling in CNNs?
a) To reduce spatial dimensions
b) To increase spatial dimensions
c) To normalize input data
d) To initialize weights
Answer: b) To increase spatial dimensions
Explanation: Upsampling is used to reconstruct or enlarge feature maps.

Which technique is commonly used for upsampling in CNNs?
a) Transposed convolution
b) Max pooling
c) Dropout
d) Batch normalization
Answer: a) Transposed convolution
Explanation: Transposed convolution upsamples feature maps.

What is the effect of upsampling on the spatial resolution of feature maps?
a) Decreases resolution
b) Increases resolution
c) Maintains resolution
d) Eliminates resolution
Answer: b) Increases resolution
Explanation: Upsampling enlarges the spatial size of feature maps.

In which task is upsampling commonly used in CNNs?
a) Image classification
b) Image segmentation
c) Object detection
d) Data augmentation
Answer: b) Image segmentation
Explanation: Upsampling helps reconstruct pixel-level predictions in segmentation.

What is a potential drawback of upsampling in CNNs?
a) Reduced computational cost
b) Introduction of artifacts
c) Increased accuracy
d) Elimination of noise
Answer: b) Introduction of artifacts
Explanation: Upsampling can introduce checkerboard artifacts if not handled properly.
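
A one-dimensional transposed convolution shows both the upsampling and where checkerboard artifacts can come from. This is a minimal plain-Python sketch. The function name, the all-ones inputs and the stride of 2 are assumptions chosen for illustration.

```python
def conv_transpose1d(x, w, stride=2):
    """Each input value stamps a scaled copy of the kernel into the output."""
    out = [0] * ((len(x) - 1) * stride + len(w))
    for i, value in enumerate(x):
        for k, weight in enumerate(w):
            out[i * stride + k] += value * weight
    return out


# Kernel length equal to the stride: stamps do not overlap.
even = conv_transpose1d([1, 1, 1], [1, 1], stride=2)
# Kernel length 3 with stride 2: stamps overlap at every second position.
uneven = conv_transpose1d([1, 1, 1], [1, 1, 1], stride=2)
```

In the second case a constant input yields an alternating output, because some positions receive contributions from two stamps and others from one. In 2D this uneven overlap is the pattern usually described as a checkerboard artifact.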

Architectures

Which CNN architecture won the ImageNet competition in 2012?
a) VGG
b) AlexNet
c) ResNet
d) Inception
Answer: b) AlexNet
Explanation: AlexNet’s success revitalized CNN research.

What is a key feature of the VGG architecture?
a) Skip connections
b) Small 3x3 filters
c) Parallel convolutions
d) No pooling layers
Answer: b) Small 3x3 filters
Explanation: VGG uses stacks of small filters for depth.

Which architecture introduced the concept of inception modules?
a) AlexNet
b) VGG
c) GoogLeNet
d) ResNet
Answer: c) GoogLeNet
Explanation: Inception modules combine multiple filter sizes.

What is a key innovation in ResNet architectures?
a) Use of large filters
b) Skip connections
c) Absence of activation functions
d) Fixed stride
Answer: b) Skip connections
Explanation: Skip connections mitigate vanishing gradients.
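
Why skip connections help gradient flow can be seen in a deliberately simplified scalar model. If each layer computes f(x) with local slope s, a plain stack multiplies the gradient by s per layer. A residual layer computes x + f(x) and multiplies it by 1 + s. The function below and the slope value 0.1 are illustrative assumptions, not a model of a real network.

```python
def chain_gradient(local_slopes, residual):
    """Product of per-layer derivatives along a chain of scalar layers."""
    grad = 1.0
    for slope in local_slopes:
        # Plain layer: d f(x)/dx = slope. Residual layer: d (x + f(x))/dx = 1 + slope.
        grad *= (1.0 + slope) if residual else slope
    return grad


slopes = [0.1] * 20  # 20 layers, each with a small local slope
plain_grad = chain_gradient(slopes, residual=False)
residual_grad = chain_gradient(slopes, residual=True)
```

In this toy setting the plain gradient shrinks to roughly 1e-20, while the residual version stays above 1 because the identity path always contributes a factor of at least 1.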

Which architecture is known for its depth and residual learning?
a) AlexNet
b) VGG
c) ResNet
d) Inception
Answer: c) ResNet
Explanation: ResNet uses residual blocks for deep networks.

What is a disadvantage of very deep CNN architectures?
a) Increased accuracy
b) Vanishing gradients
c) Reduced training time
d) Smaller model size
Answer: b) Vanishing gradients
Explanation: Depth can cause gradient issues without mitigation.

Which architecture uses a global average pooling layer before the output?
a) AlexNet
b) VGG
c) GoogLeNet
d) ResNet
Answer: c) GoogLeNet
Explanation: Global pooling reduces parameters in GoogLeNet.
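
Global average pooling reduces each feature map to a single number, so the classifier that follows needs one input per channel rather than one per pixel. The sketch below is a plain-Python illustration. The function name and the two toy channels are assumptions.

```python
def global_average_pool(feature_maps):
    """feature_maps: list of channels, each a 2D list. Returns one mean per channel."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


maps = [
    [[1, 2], [3, 4]],  # channel 0
    [[0, 0], [8, 0]],  # channel 1
]
vector = global_average_pool(maps)  # one value per channel
```

The output length equals the number of channels regardless of the spatial size, and the pooling step itself has no weights.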

What is a characteristic of the DenseNet architecture?
a) No connections between layers
b) Dense connectivity
c) Use of large strides
d) Absence of pooling
Answer: b) Dense connectivity
Explanation: DenseNet connects all layers to enhance feature reuse.

Which CNN architecture is known for its simplicity and uniform structure?
a) AlexNet
b) VGG
c) ResNet
d) Inception
Answer: b) VGG
Explanation: VGG’s uniform structure simplifies design.

What is the purpose of the bottleneck design in ResNet?
a) To increase parameters
b) To reduce computational cost
c) To eliminate pooling
d) To normalize data
Answer: b) To reduce computational cost
Explanation: Bottlenecks optimize resource usage.

Visualization

What is the purpose of visualizing feature maps in CNNs?
a) To reduce model size
b) To understand learned features
c) To normalize input data
d) To increase training speed
Answer: b) To understand learned features
Explanation: Feature maps reveal what the network learns.

Which technique visualizes the importance of input regions for CNN predictions?
a) Gradient descent
b) Grad-CAM
c) Dropout
d) Data augmentation
Answer: b) Grad-CAM
Explanation: Grad-CAM highlights influential input regions.

What does a saliency map show in CNN visualization?
a) The loss function
b) The gradient impact on the input
c) The number of parameters
d) The learning rate
Answer: b) The gradient impact on the input
Explanation: Saliency maps show gradient effects on input pixels.

Which visualization technique helps interpret CNN filters?
a) t-SNE
b) Filter visualization
c) PCA
d) K-means
Answer: b) Filter visualization
Explanation: Filter visualization shows learned patterns.

What is a common tool used to visualize CNN activations?
a) TensorBoard
b) Excel
c) MATLAB
d) PowerPoint
Answer: a) TensorBoard
Explanation: TensorBoard is widely used for CNN visualization.

What can visualization of CNN layers reveal?
a) Training time
b) Hierarchical feature extraction
c) Number of epochs
d) Learning rate schedule
Answer: b) Hierarchical feature extraction
Explanation: Layers show progression from edges to objects.

Which technique uses backpropagation to create a heatmap?
a) Max pooling
b) Guided backpropagation
c) Transposed convolution
d) Dropout
Answer: b) Guided backpropagation
Explanation: Guided backpropagation generates heatmaps.

What is a limitation of CNN visualization techniques?
a) They are too fast
b) They may not fully explain decisions
c) They reduce accuracy
d) They eliminate parameters
Answer: b) They may not fully explain decisions
Explanation: Visualizations provide insights but not complete explanations.

Which visualization method highlights class-specific regions?
a) Average pooling
b) Class Activation Mapping (CAM)
c) Convolution
d) Upsampling
Answer: b) Class Activation Mapping (CAM)
Explanation: CAM focuses on regions relevant to a class.

What is the purpose of occlusion sensitivity analysis in CNNs?
a) To increase model size
b) To identify critical input regions
c) To reduce training time
d) To normalize data
Answer: b) To identify critical input regions
Explanation: Occlusion tests reveal important areas by masking inputs.
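
Occlusion sensitivity can be sketched without any deep learning library: mask one patch at a time and record how much a score drops. In the plain-Python example below, toy_score is a hypothetical stand-in for a trained classifier's class score. The function names, patch size and fill value are likewise assumptions.

```python
def occlusion_map(image, score_fn, patch=1, fill=0):
    """Heatmap of score drops when each patch of the image is masked with `fill`."""
    base = score_fn(image)
    h, w = len(image), len(image[0])
    heat = [[0.0] * w for _ in range(h)]
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = [row[:] for row in image]  # copy, so the input is not modified
            cells = [(a, b) for a in range(i, min(i + patch, h))
                     for b in range(j, min(j + patch, w))]
            for a, b in cells:
                occluded[a][b] = fill
            drop = base - score_fn(occluded)  # large drop = important region
            for a, b in cells:
                heat[a][b] = drop
    return heat


def toy_score(img):
    # Hypothetical classifier score: only the centre pixel matters.
    return float(img[1][1])


image = [
    [1, 1, 1],
    [1, 9, 1],
    [1, 1, 1],
]
heat = occlusion_map(image, toy_score)
```

Only the centre cell of the heatmap is non-zero, matching the fact that the toy score depends on that pixel alone. With a real model, score_fn would run a forward pass and return the probability of the class of interest.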

What does a feature map visualization typically show?
a) The loss value
b) Activated regions after convolution
c) The number of layers
d) The learning rate
Answer: b) Activated regions after convolution
Explanation: Feature maps display activated areas.

Which tool can visualize CNN training progress?
a) Notepad
b) TensorBoard
c) Word
d) Paint
Answer: b) TensorBoard
Explanation: TensorBoard tracks and visualizes training metrics.

What is a common application of CNN visualization?
a) Data preprocessing
b) Model debugging
c) Weight initialization
d) Loss computation
Answer: b) Model debugging
Explanation: Visualization helps debug and interpret models.

Which technique uses gradients to produce a saliency map?
a) Max pooling
b) Vanilla gradient
c) Transposed convolution
d) Dropout
Answer: b) Vanilla gradient
Explanation: Vanilla gradients create saliency maps from input gradients.

What can over-visualization of CNNs lead to?
a) Improved accuracy
b) Overcomplication and confusion
c) Reduced training time
d) Smaller model size
Answer: b) Overcomplication and confusion
Explanation: Excessive visualization can obscure insights.

Additional Questions

What is the effect of increasing the number of filters in a convolutional layer?
a) Reduces output size
b) Increases feature diversity
c) Eliminates pooling
d) Normalizes input
Answer: b) Increases feature diversity
Explanation: More filters capture varied features.

What is the purpose of the ReLU activation in CNNs?
a) To normalize data
b) To introduce non-linearity
c) To reduce parameters
d) To initialize weights
Answer: b) To introduce non-linearity
Explanation: ReLU enables complex pattern learning.

Which layer follows convolution in a typical CNN architecture?
a) Input layer
b) Pooling layer
c) Output layer
d) Fully connected layer
Answer: b) Pooling layer
Explanation: Pooling often follows convolution.

What is the effect of a larger stride in convolution?
a) Increases output size
b) Reduces output size
c) Maintains output size
d) Eliminates filters
Answer: b) Reduces output size
Explanation: Larger strides reduce spatial dimensions.

What is a common padding strategy to preserve input size?
a) No padding
b) Same padding
c) Valid padding
d) Zero padding
Answer: b) Same padding
Explanation: Same padding maintains input dimensions.

What is the purpose of the bias term in convolution?
a) To initialize weights
b) To shift the activation
c) To reduce parameters
d) To normalize data
Answer: b) To shift the activation
Explanation: Bias adjusts the filter output.

Which CNN component helps reduce overfitting?a) Convolutionb) Dropoutc) Upsamplingd)


DownsamplingAnswer: b) DropoutExplanation: Dropout prevents overfitting.

What is the role of the activation function after convolution?a) To reduce dimensionsb) To introduce
non-linearityc) To initialize filtersd) To compute lossAnswer: b) To introduce non-linearityExplanation:
Activation adds non-linearity to feature maps.

What is the effect of increasing filter size in convolution?
a) Captures smaller patterns
b) Captures larger patterns
c) Reduces parameters
d) Eliminates padding
Answer: b) Captures larger patterns
Explanation: Larger filters detect broader features.

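Which pooling method is invariant to small translations?
a) Average pooling
b) Max pooling
c) Global pooling
d) Sum pooling
Answer: b) Max pooling
Explanation: Max pooling gives approximate invariance to small translations, because a feature that shifts within the pooling window still produces the same maximum.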

What is the purpose of upsampling in generative CNNs?
a) To classify images
b) To generate images
c) To reduce dimensions
d) To normalize data
Answer: b) To generate images
Explanation: Upsampling reconstructs images in generative models.

Which architecture uses a "bottleneck" block?
a) AlexNet
b) ResNet
c) VGG
d) Inception
Answer: b) ResNet
Explanation: ResNet uses bottleneck blocks for efficiency.

What is the purpose of skip connections in ResNet?
a) To reduce parameters
b) To mitigate vanishing gradients
c) To increase stride
d) To normalize data
Answer: b) To mitigate vanishing gradients
Explanation: Skip connections aid gradient flow.

Which visualization technique uses class scores?
a) Grad-CAM
b) Saliency map
c) Filter visualization
d) Occlusion
Answer: a) Grad-CAM
Explanation: Grad-CAM leverages class scores.

What is a common application of CNN visualization?
a) Data preprocessing
b) Medical imaging analysis
c) Weight initialization
d) Loss optimization
Answer: b) Medical imaging analysis
Explanation: Visualization aids in interpreting medical scans.

What is the effect of no padding in convolution?
a) Increases output size
b) Decreases output size
c) Maintains output size
d) Eliminates filters
Answer: b) Decreases output size
Explanation: No padding reduces output dimensions.

Which layer combines features in a CNN before classification?
a) Convolutional layer
b) Fully connected layer
c) Pooling layer
d) Upsampling layer
Answer: b) Fully connected layer
Explanation: Fully connected layers integrate features.

What is the purpose of the softmax layer in CNNs?
a) To extract features
b) To compute class probabilities
c) To reduce dimensions
d) To initialize weights
Answer: b) To compute class probabilities
Explanation: Softmax outputs probabilities.
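
To make the softmax answer concrete, here is a minimal pure-Python sketch that turns raw class scores into probabilities. The example scores are made up, and subtracting the maximum is a common numerical-stability trick rather than something stated in the slides.

```python
import math


def softmax(logits):
    """Map raw class scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for three classes
print(probs)       # the largest score receives the largest probability
print(sum(probs))  # approximately 1.0
```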

Which pooling method preserves more spatial information?
a) Max pooling
b) Average pooling
c) Global pooling
d) Sum pooling
Answer: b) Average pooling
Explanation: Average pooling retains more spatial details.

What is the effect of increasing the number of convolutional layers?
a) Reduces depth
b) Increases feature hierarchy
c) Eliminates pooling
d) Normalizes data
Answer: b) Increases feature hierarchy
Explanation: More layers enhance feature complexity.

Which architecture avoids fully connected layers?
a) AlexNet
b) GoogLeNet
c) VGG
d) ResNet
Answer: b) GoogLeNet
Explanation: GoogLeNet uses global average pooling before the classifier instead of large fully connected layers.

What is the purpose of batch normalization in CNNs?
a) To reduce parameters
b) To stabilize training
c) To increase stride
d) To eliminate filters
Answer: b) To stabilize training
Explanation: Batch normalization normalizes layer inputs.

Which technique visualizes the impact of input occlusion?
a) Grad-CAM
b) Occlusion sensitivity
c) Saliency map
d) Filter visualization
Answer: b) Occlusion sensitivity
Explanation: Occlusion tests input importance.
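
A minimal sketch of the occlusion idea: hide one patch of the input at a time and record how much the model's score drops. Everything here is hypothetical, including the function names and the toy_score stand-in for a real classifier, which only looks at the centre pixel.

```python
def occlusion_sensitivity(image, score_fn, patch=1, fill=0.0):
    """Return a heatmap of score drops when each patch of the image is occluded."""
    base = score_fn(image)
    h, w = len(image), len(image[0])
    heat = [[0.0] * w for _ in range(h)]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            occluded = [row[:] for row in image]  # copy so the input is untouched
            cells = [(r, c)
                     for r in range(top, min(top + patch, h))
                     for c in range(left, min(left + patch, w))]
            for r, c in cells:
                occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r, c in cells:
                heat[r][c] = drop
    return heat


def toy_score(img):
    """Hypothetical stand-in for a classifier score: depends only on the centre pixel."""
    return img[1][1]


image = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
for row in occlusion_sensitivity(image, toy_score):
    print(row)  # only the centre cell shows a non-zero drop
```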

What is the effect of a smaller stride in convolution?
a) Increases output size
b) Reduces output size
c) Maintains output size
d) Eliminates padding
Answer: a) Increases output size
Explanation: Smaller strides produce larger outputs.

Which layer is critical for spatial invariance in CNNs?
a) Convolutional layer
b) Pooling layer
c) Fully connected layer
d) Upsampling layer
Answer: b) Pooling layer
Explanation: Pooling provides translation invariance.

What is the purpose of the ReLU function in CNNs?
a) To normalize data
b) To mitigate vanishing gradients
c) To reduce parameters
d) To initialize filters
Answer: b) To mitigate vanishing gradients
Explanation: ReLU does not saturate for positive inputs, so gradients pass through active units without shrinking.
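
The vanishing-gradient point can be illustrated with a toy calculation. The sketch assumes a chain of 10 layers in which each layer contributes only its activation derivative (weights are ignored), which is a simplification made here for illustration.

```python
import math


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


depth = 10  # assumed number of stacked layers
sigmoid_chain = sigmoid_grad(0.0) ** depth  # best case for sigmoid, still tiny
relu_chain = relu_grad(1.0) ** depth        # active ReLU units pass the gradient through

print(sigmoid_chain)  # about 9.5e-07
print(relu_chain)     # 1.0
```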

Which architecture uses parallel convolutional paths?
a) VGG
b) Inception
c) ResNet
d) AlexNet
Answer: b) Inception
Explanation: Inception uses multi-scale convolutions.

What is the effect of downsampling on computational cost?
a) Increases cost
b) Decreases cost
c) Maintains cost
d) Eliminates cost
Answer: b) Decreases cost
Explanation: Downsampling reduces data size.

Which upsampling method avoids checkerboard artifacts?
a) Nearest neighbor
b) Transposed convolution with proper stride
c) Max pooling
d) Average pooling
Answer: b) Transposed convolution with proper stride
Explanation: Choosing the stride so that kernel overlaps are even (for example, a kernel size divisible by the stride) reduces checkerboard artifacts.
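
One way to see why the stride setting matters is to count how many input positions contribute to each output position of a 1D transposed convolution. The sketch reads "proper stride" as "kernel size divisible by stride"; that reading, the function name and the sizes used are assumptions for illustration.

```python
def transposed_conv_coverage(n_in, kernel, stride):
    """Count how many input positions write to each output position (1D, no padding)."""
    out_len = (n_in - 1) * stride + kernel
    counts = [0] * out_len
    for i in range(n_in):
        for j in range(kernel):
            counts[i * stride + j] += 1
    return counts


# Kernel 3, stride 2: overlaps alternate between 1 and 2, giving a checkerboard-like pattern.
print(transposed_conv_coverage(4, 3, 2))  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
# Kernel 4, stride 2: every interior position is covered exactly twice.
print(transposed_conv_coverage(4, 4, 2))  # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```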

What is the purpose of visualizing CNN gradients?
a) To initialize weights
b) To interpret model decisions
c) To reduce layers
d) To normalize data
Answer: b) To interpret model decisions
Explanation: Gradients show input influence.

Which architecture is known for its residual blocks?
a) AlexNet
b) VGG
c) ResNet
d) Inception
Answer: c) ResNet
Explanation: Residual blocks define ResNet.

What is the effect of increasing padding in convolution?
a) Reduces output size
b) Increases output size
c) Maintains output size
d) Eliminates filters
Answer: b) Increases output size
Explanation: More padding expands the output.

Which pooling method is sensitive to noise?
a) Max pooling
b) Average pooling
c) Global pooling
d) Sum pooling
Answer: b) Average pooling
Explanation: Average pooling lets every value in the window, including noisy ones, contribute to the output.

What is the purpose of upsampling in autoencoders?
a) To encode data
b) To decode data
c) To reduce dimensions
d) To normalize data
Answer: b) To decode data
Explanation: Upsampling reconstructs input in decoding.

Which visualization technique uses layer activations?
a) Grad-CAM
b) Feature map visualization
c) Saliency map
d) Occlusion
Answer: b) Feature map visualization
Explanation: Feature maps show activations.

What is the effect of a larger filter size on receptive field?
a) Reduces receptive field
b) Increases receptive field
c) Maintains receptive field
d) Eliminates receptive field
Answer: b) Increases receptive field
Explanation: Larger filters cover more area.
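
The receptive-field claim can be checked with the usual recurrence for stacked convolutions: each layer adds (k - 1) times the product of all earlier strides. The helper below is an illustrative sketch, and the layer configurations are made-up examples.

```python
def receptive_field(layers):
    """Receptive field of one output unit; layers is a list of (kernel, stride) pairs."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # growth is scaled by the accumulated stride
        jump *= stride
    return rf


print(receptive_field([(3, 1)]))          # 3
print(receptive_field([(5, 1)]))          # 5: a larger filter sees more of the input
print(receptive_field([(3, 1), (3, 1)]))  # 5: two stacked 3-wide filters match one 5-wide filter
print(receptive_field([(3, 2), (3, 1)]))  # 7: striding widens the field further
```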

Which layer precedes the output layer in CNNs?
a) Convolutional layer
b) Pooling layer
c) Fully connected layer
d) Upsampling layer
Answer: c) Fully connected layer
Explanation: Fully connected layers feed into the output.

What is the purpose of the softmax layer in classification CNNs?
a) To extract features
b) To compute probabilities
c) To reduce dimensions
d) To initialize weights
Answer: b) To compute probabilities
Explanation: Softmax provides class probabilities.

Which downsampling method is robust to outliers?
a) Max pooling
b) Average pooling
c) Global pooling
d) Sum pooling
Answer: a) Max pooling
Explanation: Max pooling keeps only the strongest activation in each window, so low-valued outliers are ignored.

What is the effect of upsampling on feature map detail?
a) Reduces detail
b) Increases detail
c) Maintains detail
d) Eliminates detail
Answer: b) Increases detail
Explanation: Upsampling increases spatial resolution, and learned upsampling can add detail.

Which architecture uses a "stem" module?
a) AlexNet
b) Inception
c) ResNet
d) VGG
Answer: b) Inception
Explanation: Inception uses a stem for initial processing.

What is the purpose of visualizing CNN weights?
a) To reduce parameters
b) To understand filter patterns
c) To normalize data
d) To increase layers
Answer: b) To understand filter patterns
Explanation: Weights reveal learned filters.

Which technique combines gradients and activations?
a) Grad-CAM
b) Saliency map
c) Occlusion
d) Filter visualization
Answer: a) Grad-CAM
Explanation: Grad-CAM integrates both for visualization.
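
The way Grad-CAM combines the two can be sketched in a few lines: each channel's activation map is weighted by the spatial average of its gradient, the weighted maps are summed, and a ReLU keeps only positive evidence. The inputs below are hand-made numbers; in practice the activations and gradients would come from a trained network, which this sketch does not include.

```python
def grad_cam(activations, gradients):
    """Combine per-channel activations (K x H x W) with class-score gradients (K x H x W)."""
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial average of that channel's gradient.
    weights = [sum(v for row in g for v in row) / (h * w) for g in gradients]
    cam = [[0.0] * w for _ in range(h)]
    for wk, act in zip(weights, activations):
        for r in range(h):
            for c in range(w):
                cam[r][c] += wk * act[r][c]
    # ReLU keeps only locations that support the class.
    return [[max(0.0, v) for v in row] for row in cam]


# Hypothetical two-channel example on a 2 x 2 feature map.
activations = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(activations, gradients))  # [[1.0, 0.0], [0.0, 0.0]]
```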

What is the effect of a smaller stride on computation?
a) Increases computation
b) Decreases computation
c) Maintains computation
d) Eliminates computation
Answer: a) Increases computation
Explanation: Smaller strides process more data.

Which layer is optional in some CNN architectures?
a) Convolutional layer
b) Fully connected layer
c) Pooling layer
d) Upsampling layer
Answer: b) Fully connected layer
Explanation: Some architectures skip fully connected layers.

What is the purpose of the ReLU function after pooling?
a) To normalize data
b) To introduce non-linearity
c) To reduce parameters
d) To initialize weights
Answer: b) To introduce non-linearity
Explanation: ReLU adds non-linearity post-pooling.

Which architecture is designed for real-time applications?
a) VGG
b) MobileNet
c) ResNet
d) Inception
Answer: b) MobileNet
Explanation: MobileNet is optimized for efficiency.

What is the effect of downsampling on overfitting?
a) Increases overfitting
b) Reduces overfitting
c) Maintains overfitting
d) Eliminates overfitting
Answer: b) Reduces overfitting
Explanation: Downsampling reduces model capacity.

Which upsampling method is simplest?
a) Transposed convolution
b) Nearest neighbor
c) Bilinear interpolation
d) Max unpooling
Answer: b) Nearest neighbor
Explanation: Nearest neighbor is the simplest upsampling.
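
Nearest-neighbor upsampling is simple enough to write in a few lines: every pixel is repeated along both axes, with no learned parameters. The function name and the 2 x 2 input are illustrative assumptions.

```python
def upsample_nearest(image, factor):
    """Repeat every pixel `factor` times along both axes of a 2D nested list."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(factor)]  # repeat along the width
        for _ in range(factor):                          # repeat along the height
            out.append(list(wide))
    return out


for row in upsample_nearest([[1, 2], [3, 4]], 2):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```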

What is the purpose of visualizing CNN predictions?
a) To initialize weights
b) To validate model decisions
c) To reduce layers
d) To normalize data
Answer: b) To validate model decisions
Explanation: Visualization helps check that predictions rest on sensible evidence.

Which architecture uses depthwise separable convolutions?
a) VGG
b) MobileNet
c) ResNet
d) AlexNet
Answer: b) MobileNet
Explanation: MobileNet uses separable convolutions.
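
A quick parameter count shows why depthwise separable convolutions suit efficient models. The sketch ignores biases, and the channel counts (64 in, 128 out, 3 x 3 filters) are made-up values rather than MobileNet's actual configuration.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


print(standard_conv_params(3, 64, 128))        # 73728
print(depthwise_separable_params(3, 64, 128))  # 8768
```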

What is the effect of increasing the number of layers in a CNN?
a) Reduces feature hierarchy
b) Increases feature hierarchy
c) Eliminates pooling
d) Normalizes data
Answer: b) Increases feature hierarchy
Explanation: More layers enhance feature complexity.

Which pooling method is used in global average pooling?
a) Max pooling
b) Average pooling
c) Sum pooling
d) None
Answer: b) Average pooling
Explanation: Global average pooling averages the entire map.
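
Global average pooling reduces each feature map to a single number, so K channels become a K-element vector. A minimal sketch on nested lists, with made-up values:

```python
def global_average_pool(feature_maps):
    """Average each H x W channel down to one value; returns one number per channel."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


maps = [[[1, 2], [3, 4]], [[0, 0], [0, 8]]]  # two hypothetical 2 x 2 channels
print(global_average_pool(maps))  # [2.5, 2.0]
```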

What is the purpose of upsampling in super-resolution?
a) To classify images
b) To enhance image resolution
c) To reduce dimensions
d) To normalize data
Answer: b) To enhance image resolution
Explanation: Upsampling raises the spatial resolution of the image.

Which visualization technique is most interpretable for end-users?
a) Grad-CAM
b) Saliency map
c) Occlusion
d) Filter visualization
Answer: a) Grad-CAM
Explanation: Grad-CAM provides clear, class-specific heatmaps.
