Deep Learning

1. Explain the main differences between traditional machine learning and deep learning in terms
of data representation and feature extraction.
2. Discuss the four major reasons why deep learning is preferred over traditional machine
learning approaches in modern applications.
3. Describe the core components of a Convolutional Neural Network (CNN) and explain their
roles.
4. What is the significance of ReLU and its variants (Leaky ReLU, Parametric ReLU, Noisy ReLU) in
deep learning?
5. Discuss the vanishing and exploding gradient problems in deep networks and mention two
methods to address them.
6. How has deep learning contributed to advancements in medical image analysis? Give at least
two examples from the paper.

ANNs
1. Explain how the structure of a biological neuron is analogous to an artificial neuron.
2. What are the key characteristics of biological neural networks that inspired artificial neural
networks?
3. Explain the architecture and working of a single-layer perceptron.
4. What is the perceptron learning rule? Derive it with a suitable example.
5. Why can a single-layer perceptron not solve the XOR problem?
6. With a diagram, explain linear separability in the context of perceptron.
7. Describe the structure of a multilayer perceptron. How is it different from a single-layer
perceptron?
8. Explain the backpropagation learning algorithm in a multilayer network.
9. What are the roles of hidden layers in a multilayer neural network?
10. What is a Hopfield network? How does it perform associative memory recall?
11. Explain how a Bidirectional Associative Memory (BAM) works. How is it different from
Hopfield networks?
1. Explain why the perceptron can only solve linearly separable problems, and give an example where it fails. Why can't a single-layer perceptron solve the XOR problem? What kind of network is needed instead?
2. What does it mean when we say that a perceptron defines a "hyperplane" in n-dimensional space? How does this relate to decision boundaries?
3. What role does the bias term play in the perceptron model? What happens if it is
omitted?
4. What are the different activation functions used in ANNs?

5. Apply one epoch of the Perceptron Learning Algorithm for the logical AND operation using the following (a worked Python sketch is given after this list):
• Inputs: (x₁, x₂) = (0, 0), (0, 1), (1, 0), (1, 1)
• Desired output (Yd): [0, 0, 0, 1]
• Initial weights: w₁ = 0.3, w₂ = 0.2
• Bias = 0 (optional, can be included if needed)
• Activation function: hard limiter (if the weighted sum > 0, Y = 1; else Y = 0)
• Learning rate (η): 0.1
Perform only one epoch (one full pass over all four input patterns) and compute the outputs and updated weights.

6. Why is training the hidden layer in a neural network more complex than training the output layer? Explain how this limitation is addressed in modern neural network architectures.

7. Explain why backpropagation in a neural network requires differentiable activation functions. What role does this requirement play in updating the weights during the backward phase?

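For reference, a minimal Python sketch of one epoch of the perceptron learning rule on the AND data from Question 5 above; the weights, bias, learning rate, and hard-limiter activation are taken from that question, while the variable names and printout are illustrative additions:

```python
# One epoch of the perceptron learning rule for the logical AND gate.
# Initial weights, bias, learning rate and hard limiter follow Question 5 above.
inputs  = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]            # desired outputs Yd
w1, w2, bias = 0.3, 0.2, 0.0      # initial weights, bias kept at 0
eta = 0.1                         # learning rate

for (x1, x2), yd in zip(inputs, targets):
    s = w1 * x1 + w2 * x2 + bias      # weighted sum
    y = 1 if s > 0 else 0             # hard limiter
    err = yd - y                      # error signal
    # perceptron learning rule: w <- w + eta * error * input
    w1 += eta * err * x1
    w2 += eta * err * x2
    bias += eta * err
    print(f"x=({x1},{x2}) y={y} err={err} w1={w1:.2f} w2={w2:.2f} b={bias:.2f}")
```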
MLE
1. Explain the difference between Maximum Likelihood Estimation (MLE) and Maximum A
Posteriori (MAP) estimation. Provide a scenario where MAP might be preferred.
2. Derive the MLE for the mean μ of a univariate Gaussian distribution, given a known variance σ².
3. What is a biased estimator? Discuss with respect to the MLE of variance for a Gaussian
distribution.
4. State and explain the general steps involved in the Maximum Likelihood Estimation (MLE)
procedure.
5. Differentiate between biased and unbiased estimators of variance. Why is the MLE of variance
biased?
6. What are conjugate priors? Why are they used in Bayesian estimation of Gaussian
distributions?
7. What are the limitations of MLE when the true distribution is not Gaussian? How can this
structural error affect the results?
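A short numpy sketch illustrating Questions 2, 3 and 5 above: the MLE of the mean is the sample average, and the MLE of the variance divides by N and is therefore biased. The sample size and parameters are synthetic, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=50)   # synthetic Gaussian sample (mu=5, sigma=2)

mu_mle = x.mean()                                         # MLE of the mean: sample average
var_mle = ((x - mu_mle) ** 2).mean()                      # MLE of variance: divides by N (biased)
var_unbiased = ((x - mu_mle) ** 2).sum() / (len(x) - 1)   # divides by N-1 (unbiased)

print(mu_mle, var_mle, var_unbiased)
# E[var_mle] = (N-1)/N * sigma^2, so the MLE systematically underestimates sigma^2.
```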
Regularization
1. Explain the concept of regularization in neural networks. Why is it important in preventing overfitting?
2. Compare and contrast hard constraint and soft constraint views of regularization. Provide suitable
optimization formulations for each.
3. How does the Bayesian view interpret regularization? Derive the relationship between Maximum A
Posteriori (MAP) estimation and regularization.
4. Derive the effect of L2 regularization (weight decay) on the gradient descent update rule. How does it
influence the convergence?
5. Discuss how L1 regularization induces sparsity in the model parameters. Include relevant mathematical
justification.
6. What is early stopping in neural network training? Explain how it functions as a regularization method,
along with its pros and cons.
7. Describe the dropout technique used in regularization. How does it differ from L2 regularization in
preventing overfitting?
8. Explain the role of noise in regularization. How does adding noise to inputs or weights relate to weight
decay?
9. What is internal covariate shift? How does batch normalization address this issue in deep learning
models?
10. Compare different regularization methods: L2, dropout, early stopping, and data augmentation. When
would you choose one over the others?
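In connection with Question 4 above, a minimal numpy sketch of how an L2 penalty (weight decay) changes the plain gradient-descent update; the gradient values, learning rate, and lambda here are illustrative choices, not values from the text:

```python
import numpy as np

def sgd_step(w, grad_loss, lr=0.1, lam=0.01):
    """One gradient step on loss(w) + (lam/2) * ||w||^2.

    The L2 term adds lam * w to the gradient, which is equivalent to
    shrinking the weights by a factor (1 - lr * lam) before the usual step.
    """
    return w - lr * (grad_loss + lam * w)

w = np.array([1.0, -2.0, 0.5])
grad = np.array([0.3, -0.1, 0.2])   # gradient of the unregularized loss (illustrative)
print(sgd_step(w, grad))            # weights are both updated and decayed toward zero
```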

Optimization
1. Explain the difference between expected risk and empirical risk in the context of deep learning. Why is
minimizing expected risk not feasible in practice?
2. Differentiate between batch, stochastic, and mini-batch gradient descent. Discuss their advantages and
disadvantages in deep learning optimization.
3. What are convex functions and convex sets? Why are they important in optimization problems,
particularly in deep learning?
4. Compare and contrast the following optimization algorithms: AdaGrad, RMSProp, and Adam. Highlight
the core ideas and problems they address.
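As background for Question 4 above, a compact numpy sketch of the Adam update, which combines a momentum-style first-moment estimate with RMSProp-style per-parameter scaling by a second-moment estimate; the hyperparameter values are the commonly quoted defaults and the gradients are dummy data:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; m and v are the running first/second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad           # momentum-style moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # RMSProp-style moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction for the warm-up phase
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w, m, v

w = np.zeros(3); m = np.zeros(3); v = np.zeros(3)
for t in range(1, 4):                            # a few illustrative steps with a fixed gradient
    w, m, v = adam_step(w, np.array([0.1, -0.2, 0.05]), m, v, t)
print(w)
```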

CNNs
1. Explain how convolution is applied on images in Convolutional Neural Networks. Include the concept of
filters, stride, and padding.
2. What is the purpose of pooling layers in CNNs? Explain with an example how max pooling works and its
effect on the feature map.
3. How does zero padding help in convolutional operations? Illustrate with a numerical example and
explain its impact on spatial dimensions.
4. What are the advantages of using 1x1 convolution layers in deep neural networks? Explain with
reference to computational and memory benefits.
5. Explain the concept and impact of Dropout in neural networks. How does it help in regularizing CNNs?
6. Calculate the output volume size for a convolutional layer with a 32x32x3 input, 10 filters of size 5x5,
stride 1, and padding 2. Show all steps.
7. What is the difference between fully connected layers and convolutional layers in a CNN? In what
scenarios might fully connected layers be removed?
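For Question 6 above, the standard output-size formula is output = (input − kernel + 2·padding) / stride + 1; a small sketch applying it to the 32×32×3 input with 10 filters of size 5×5, stride 1, padding 2:

```python
def conv_output_size(n_in, kernel, stride=1, padding=0):
    """Spatial output size of a convolution: (n_in - kernel + 2*padding) // stride + 1."""
    return (n_in - kernel + 2 * padding) // stride + 1

# Question 6: 32x32x3 input, 10 filters of 5x5, stride 1, padding 2
side = conv_output_size(32, 5, stride=1, padding=2)   # (32 - 5 + 4)/1 + 1 = 32
print(side, side, 10)                                  # output volume: 32 x 32 x 10
```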
Autoencoder
1. What is an autoencoder? Explain its architecture with the roles of encoder and decoder.
2. Compare autoencoders with general-purpose data compression techniques like JPEG or MP3.
3. How do autoencoders differ from general data compression?
4. Explain how the representational capacity of encoder and decoder affects the learning in
autoencoders.
5. Describe the concept of embedding and manifold learning in the context of autoencoders.
6. Why is it important to prevent an autoencoder from perfectly copying its input to output?
7. Describe how a variational autoencoder (VAE) works and its advantage over standard
autoencoders.
8. What is an undercomplete autoencoder and how does it relate to PCA?
9. What is the role of the sparsity penalty term Ω(h) in sparse autoencoders?
10. Describe the role of regularization in autoencoders and mention any three types of
regularized autoencoders.
11. What are sparsity-inducing priors? Give an example and its interpretation in autoencoders.
12. What is the significance of using deep or stacked autoencoders instead of shallow ones?
13. Explain how autoencoders relate to joint distributions in the context of generative modeling.
14. What are stochastic encoders and decoders? Explain their role in autoencoders.
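To accompany Questions 1 and 8 above, a minimal PyTorch sketch of an undercomplete autoencoder: an encoder maps the input to a bottleneck code smaller than the input, a decoder reconstructs it, and training minimizes a reconstruction loss. The layer sizes and dummy data are arbitrary illustrations:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encoder = nn.Linear(in_dim, code_dim)   # f(x): maps input to the bottleneck code h
        self.decoder = nn.Linear(code_dim, in_dim)   # g(h): reconstructs the input from the code

    def forward(self, x):
        h = torch.relu(self.encoder(x))
        return self.decoder(h)

model = Autoencoder()
x = torch.rand(16, 784)                      # a dummy mini-batch standing in for real data
loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss ||x - g(f(x))||^2
loss.backward()
```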
AUTOENCODER:
2-Marks Questions (Short Conceptual / Objective Type)
1. What is the primary difference between an undercomplete autoencoder and a regularized
autoencoder?

2. Why do autoencoders fail when the hidden layer dimension is equal to or greater than the
input dimension?

3. State two reasons why autoencoders are considered lossy compression techniques.

4. What role does the reparameterization trick play in training Variational Autoencoders (VAEs)?

5. Name any two regularization strategies used in autoencoders and briefly mention their
effects.

6. How does a denoising autoencoder learn the data distribution implicitly?

7. What does the sparsity penalty Ω(h) encourage in the latent representation of a sparse
autoencoder?

8. Why is a sparse encoder’s regularizer not treated as a Bayesian prior?

9. What type of loss function is used in a stochastic decoder when the output is binary?

10. What’s the key architectural difference between a regular autoencoder and a stochastic
autoencoder?

3-Marks Questions (Explain / Justify / Small Derivation)


11. Explain with a diagram how an autoencoder works, identifying the encoder, decoder, and
hidden code.

12. Describe how autoencoders relate to PCA. When does an autoencoder behave like PCA?

13. Why might a deep autoencoder be preferred over a shallow one, despite the universal
approximation theorem?

14. Discuss one scenario where an autoencoder can learn the identity function without learning
meaningful representations.

15. Give an example of how a denoising autoencoder handles corrupted inputs and reconstructs
the original data.

16. What is the impact of using a Laplace prior in a sparse autoencoder? Derive the penalty term
Ω(h).

17. Why is it necessary to limit the capacity of f(x) and g(h) functions in autoencoder design?

18. State and explain the loss function used in a contractive autoencoder and its interpretation.

19. Explain the concept of embedding and how autoencoders help in learning low-dimensional embeddings.

20. What are the implications of treating latent variables in autoencoders as probability
distributions?
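Relating to Questions 6 and 15 above, a small PyTorch sketch of the training step that makes an autoencoder a denoising autoencoder: the input is corrupted with noise, but the reconstruction loss is measured against the clean original. The noise level, layer sizes, and data are illustrative assumptions:

```python
import torch
import torch.nn as nn

encoder = nn.Linear(784, 64)
decoder = nn.Linear(64, 784)

x_clean = torch.rand(8, 784)                          # clean mini-batch (dummy data)
x_noisy = x_clean + 0.3 * torch.randn_like(x_clean)   # corrupted copy fed to the network

recon = decoder(torch.relu(encoder(x_noisy)))
loss = nn.functional.mse_loss(recon, x_clean)         # target is the CLEAN input, not the noisy one
loss.backward()
```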
General Regularization Questions:

1. What is the primary objective of applying regularization in a neural network, and how does it
affect the bias-variance trade-off?

2. Differentiate between explicit and implicit regularization with one example each.

3. Why is regularization considered a form of inductive bias in machine learning?

Norm Constraint / L1 & L2 Regularization:

4. Explain the difference in the optimization geometry of L1 and L2 regularization using their norm
definitions.

5. Show how L2 regularization modifies the loss function and its gradient.

6. Why does L1 regularization lead to sparsity in learned weights, whereas L2 does not? Illustrate
with a brief explanation of the penalty landscape.
Data Augmentation:
7. Why is data augmentation considered a form of regularization, and how does it reduce overfitting
without modifying the loss function?

8. Explain how data augmentation helps in improving generalization error. Provide two real-world
examples.
Early Stopping:
9. Explain early stopping as a regularization technique. How is the validation set used in this process?

10. Why is early stopping said to prevent overfitting even when no explicit penalty term is added to
the loss?
Dropout:
11. What is the mathematical intuition behind dropout? How does it approximate model averaging?

12. During testing, why is it necessary to scale activations after dropout is turned off?
Batch Normalization:
13. Is batch normalization a form of regularization? Justify your answer with reasoning related to
noise and mini-batch fluctuations.

14. How does batch normalization impact the internal covariate shift and learning speed?
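For Questions 11 and 12 above, a small numpy sketch of dropout in its common "inverted" form, where the rescaling by 1/(1−p) is applied during training so that nothing needs to be scaled at test time (in the original formulation, the scaling by (1−p) is instead applied to activations at test time); the keep probability and activations are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p and rescale the survivors."""
    if not training:
        return h                       # at test time all units are kept, no scaling needed
    mask = rng.random(h.shape) >= p    # keep each unit with probability 1 - p
    return h * mask / (1.0 - p)        # rescale so the expected activation is unchanged

h = np.array([1.0, 2.0, 3.0, 4.0])
print(dropout(h, p=0.5))               # training: roughly half the units zeroed, rest doubled
print(dropout(h, training=False))      # testing: unchanged
```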

CNN:
1. (Definition – 2 marks)
What is a Convolutional Neural Network (CNN), and how does it differ from a traditional fully connected
neural network?

2. (Architecture – 3 marks)
Draw and explain the standard architecture of a CNN, including convolutional, pooling, and fully
connected layers.

4. (Activation – 2 marks)
Write down the ReLU, Leaky ReLU, and Sigmoid activation functions and briefly explain the role of
activation functions in CNNs.

5. (Convolution – 3 marks)
Given a 4×4 grayscale image and a 2×2 kernel, with stride 1 and no padding, perform one step of the
convolution and show the output feature map.

7. (Parameter Sharing – 2 marks)
What is meant by parameter sharing in CNNs, and how does it reduce the number of learnable parameters?

8. (Loss Functions – 3 marks)
Define and differentiate between Cross-Entropy Loss and Euclidean Loss. Provide their mathematical equations.

10. (Regularization – 3 marks)
Explain how dropout and batch normalization help in reducing overfitting in CNNs.

11. (Optimizer – 3 marks)
Explain how the Adam optimizer works. Provide its mathematical update rule combining momentum and RMSProp.

12. (Vanishing Gradient – 2 marks)
What is the vanishing gradient problem in CNNs? How do ReLU and batch normalization help mitigate it?

13. (Advanced Arch. – 4 marks)
Compare the architecture and improvements of AlexNet, VGGNet, and ResNet. Mention depth, filter sizes, and unique features.

14. (Data Flow – 3 marks)
For an RGB image of size 32×32×3, passed through a convolution layer with 6 filters of size 5×5, stride 1 and no padding:
• Calculate the output feature map size.
• How many parameters does the layer have (excluding bias)?
15. (Softmax – 3 marks)
Given an output vector of logits [2.0, 1.0, 0.1], apply the softmax function and find the class
probabilities.
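For Question 15 above, a short sketch of the softmax computation on the given logits, subtracting the maximum first for numerical stability (which does not change the result):

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # ≈ [0.659, 0.242, 0.099]
```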

VAEs
1. Explain the role of latent variables in Variational Autoencoders (VAEs). How does the VAE use the latent
space during inference and generation?
2. What is the Evidence Lower Bound (ELBO) in the context of VAEs? Derive the ELBO and explain its two
main components.
3. Discuss the limitations of the score function estimator for computing gradients in variational inference.
How does VAE address these limitations?
4. What is the reparameterization trick in VAEs? Illustrate it using the Gaussian distribution and explain
how it enables backpropagation.
5. Describe the AEVB (Auto-Encoding Variational Bayes) algorithm. What are the key components that
make it suitable for training VAEs on large datasets?
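To accompany Question 4 above, a minimal PyTorch sketch of the Gaussian reparameterization trick: instead of sampling z ~ N(μ, σ²) directly, sample ε ~ N(0, I) and set z = μ + σ·ε, so that gradients can flow back through μ and σ. The encoder outputs here are dummy tensors rather than a real encoder:

```python
import torch

# Dummy encoder outputs for a batch of 8 latent vectors of dimension 4.
mu = torch.zeros(8, 4, requires_grad=True)
log_var = torch.zeros(8, 4, requires_grad=True)

std = torch.exp(0.5 * log_var)   # sigma = exp(log_var / 2)
eps = torch.randn_like(std)      # noise drawn from N(0, I), independent of the parameters
z = mu + std * eps               # reparameterized sample: differentiable w.r.t. mu, log_var

z.sum().backward()               # gradients reach mu and log_var through the sample
print(mu.grad.shape, log_var.grad.shape)
```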

Transfer Learning
1. Define Transfer Learning. Explain its importance in deep learning with an example.
2. Differentiate between Inductive, Transductive, and Unsupervised Transfer Learning. Provide a practical
example for each.
3. What are the major challenges in Transfer Learning? Discuss how the concepts of 'What to transfer',
'When to transfer', and 'How to transfer' address these challenges.
4. Explain the concept of Representation Learning in the context of Transfer Learning. Why are early CNN
layers considered more general?
5. Describe the fine-tuning process in Transfer Learning. Why is it important to freeze certain layers initially
and use differential learning rates?
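Relating to Question 5 above, a hedged PyTorch sketch of the usual fine-tuning recipe: freeze the early (more general) layers of a pretrained backbone, replace the task head, and give the unfrozen parts different learning rates via optimizer parameter groups. The choice of model, which layers to unfreeze, the number of classes, and the learning rates are all illustrative assumptions:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # pretrained backbone (illustrative choice)

for p in model.parameters():         # start with the whole backbone frozen
    p.requires_grad = False
for p in model.layer4.parameters():  # unfreeze only the last residual block
    p.requires_grad = True

model.fc = nn.Linear(model.fc.in_features, 10)     # new task-specific head (10 classes assumed)

# Differential learning rates: small for the unfrozen backbone block, larger for the new head.
optimizer = torch.optim.SGD(
    [
        {"params": model.layer4.parameters(), "lr": 1e-4},
        {"params": model.fc.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
)
```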
Group - A (Compulsory Question)

1.

(a) Differentiate between a Perceptron and a Multilayer Feedforward Neural Network. (3 marks)

(b) Given a 5×5 input matrix and a 3×3 filter (kernel) with a stride of 1 and no padding, what is the size of the feature map after one convolution operation? Show the computation. (3 marks)

(c) What are the challenges in neural networks optimization? (4 marks)

(d) What is an autoencoder and what are its applications? (3 marks)

(e) Why is dropout used during the training of neural networks? (3 marks)

(f) Discuss any two applications of deep neural networks. (4 marks)

(g) What is the bias-variance tradeoff? How does regularization mitigate overfitting? (10 marks)
Group - B (Attempt any 4 questions)

2.

(a) Describe the structure of a feedforward neural network with an appropriate diagram. (7 marks)

(b) How does the backpropagation algorithm work? Also, explain the chain rule of calculus with respect to
backpropagation. (8 marks)

3.
(a) Discuss Adam and RMSProp optimization techniques. (6 marks)
(b) What is the role of activation functions in neural networks? (9 marks)
Write the activation function for the following:
(i) Sigmoid
(ii) ReLU
(iii) Linear

Also compute the output of each of the above activation functions when the input is x = 2.
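A quick sketch of the three activation functions named in part (b), evaluated at x = 2 (values rounded):

```python
import math

x = 2.0
sigmoid = 1.0 / (1.0 + math.exp(-x))   # 1 / (1 + e^-2) ≈ 0.881
relu = max(0.0, x)                      # max(0, 2) = 2
linear = x                              # identity: 2
print(sigmoid, relu, linear)
```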

4.

(a) Explain the architecture and working of a CNN, and discuss the role of dropout and pooling. (10 marks)

(b) Give two benefits of using convolutional layers instead of fully connected ones for visual tasks. (5
marks)

(a) Draw a basic architecture diagram of a Generative Adversarial Network (GAN). (3 marks)

(b) Differentiate between GAN and Variational Autoencoder (VAE) based on the following criteria:

(i) Architecture

(ii) Training objective

(iii) Output quality

(iv) Latent space characteristics

(v) Use cases (5 marks)

(c) How do autoencoders differ from general data compression? Compare undercomplete and denoising variants. (7 marks)
variants. (7 marks)
7.

(a) Explain any three SOTA models. (3 marks)

(b) What is Transfer Learning? What are the strategies in Transfer Learning? (7 marks)

(c) What are the applications of Prompt Engineering? (5 marks)

8.

a) Explain the concept of regularization and its different types. (7 marks)

b) Explain: (i) Jailbreaking, (ii) Prompt Leaking. (4 marks)

c) What is the role of Prompt Engineering in Transfer Learning? (4 marks)
