
ANN_Module1

PART A
Q1) Explain how a biological neuron can be modeled into an
artificial neuron.
Answer:

A biological neuron consists of dendrites (to receive signals), a cell body (to
process them), and an axon (to send the output). To model this in an artificial
neuron, we replace these parts with mathematical components: inputs
correspond to dendrites, each input is multiplied by a weight (showing its
strength), then all are summed up in the processing unit (like the cell body). A bias
can also be added to control the threshold. This sum is passed through an
activation function (similar to the firing mechanism of a neuron), which decides
the final output sent forward (like the axon). This simple model forms the basis of
Artificial Neural Networks.

Diagram (ASCII):

x1 -- w1 \
x2 -- w2 >--( Σ + b )--[ φ ]--> y
... -- .../

Key mapping: synaptic efficacy ↔ weight magnitude/sign; temporal integration ↔ weighted sum; firing ↔ activation function crossing a threshold.

Example: For binary classification, choose φ = sign. If wᵀx + b > 0 then y = +1 and the neuron “fires”.

Why this works: Captures linearity plus nonlinearity (via φ), enabling learnable transformations of inputs that mimic excitatory/inhibitory influences.
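
A minimal sketch of this model in Python (NumPy), using the sign activation from the example above; the inputs, weights, and bias are illustrative values, not trained ones:

import numpy as np

def neuron(x, w, b, phi=np.sign):
    # Artificial neuron: weighted sum of inputs plus bias, passed through phi
    return phi(w @ x + b)

x = np.array([0.5, -1.0])    # inputs (dendrites)
w = np.array([0.8, 0.2])     # synaptic weights
b = 0.1                      # bias shifts the firing threshold
print(neuron(x, w, b))       # +1 when w.x + b > 0, else -1 (np.sign(0) is 0)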

Q2) Compare and contrast error-correction learning and Hebbian learning.
Answer:

Hebbian learning is unsupervised and purely local: a weight grows when its input and the neuron’s output are active together (Δw_i = η x_i y), so it captures correlations without any target signal. Error-correction learning is supervised: weights change in proportion to the difference between the target and the actual output (Δw = η (t − y) x), as in the perceptron and delta rules. Both rules are local and incremental, but they differ in the driving signal (correlation vs error), in the data they need (unlabeled vs labeled), and in their dynamics: error-correction updates stop once the error reaches zero, whereas plain Hebbian growth is unbounded unless normalized (e.g., Oja’s rule).
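
A small sketch contrasting the two update rules on a single training pair (all numbers illustrative):

import numpy as np

eta = 0.1                    # learning rate
x = np.array([1.0, 0.5])     # input pattern
t = 1.0                      # target (used only by error correction)

w = np.array([0.2, -0.3])
y = w @ x                    # linear neuron output

dw_hebb  = eta * x * y       # Hebbian: input-output correlation
dw_delta = eta * (t - y) * x # delta rule: proportional to the error (t - y)
print(dw_hebb, dw_delta)
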
Q3) Apply the concept of competitive learning to classify a simple
set of input patterns.
Answer:

Take two competing neurons with weight vectors w_1 = (1, 0) and w_2 = (0, 1), and input patterns clustered near the two axes. For each input x, the winner is the neuron whose weight vector is closest to x; only the winner is updated: w_k ← w_k + η (x − w_k). For x = (0.9, 0.2), neuron 1 wins (distance ≈ 0.22 vs ≈ 1.20) and, with η = 0.5, moves to w_1 = (0.95, 0.1). Repeating this over the pattern set drives each weight vector toward the centroid of one cluster, so the index of the winning neuron serves as the class label for that input. (A code sketch of this procedure appears under Part B, Q6.)
Q4) Solve a simple neural network problem using Boltzmann
learning and interpret the energy function.
Answer:

Consider a two-unit Boltzmann machine with states s_i ∈ {−1, +1}, symmetric weight w_12 = 1, and zero thresholds. The energy function is E(s) = −0.5 Σ_i Σ_j w_ij s_i s_j − Σ_i θ_i s_i, and configurations are visited with probability P(s) ∝ e^(−E(s)). Here the agreeing states (+1, +1) and (−1, −1) have energy −1, while the disagreeing states have energy +1, so the network prefers configurations in which the two units agree. Boltzmann learning, Δw_ij = η (⟨s_i s_j⟩_data − ⟨s_i s_j⟩_model), lowers the energy of configurations seen in the data and raises the energy of configurations the model generates on its own; the energy function is interpreted as a measure of how incompatible a configuration is with the stored constraints.
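
A minimal sketch (the illustrative two-unit network above) that enumerates the states and their energies:

import itertools
import numpy as np

W = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # symmetric weights, zero self-connections
theta = np.zeros(2)          # thresholds

def energy(s):
    # E(s) = -0.5 * s^T W s - theta . s
    return -0.5 * s @ W @ s - theta @ s

for s in itertools.product([-1, 1], repeat=2):
    s = np.array(s, dtype=float)
    print(s, energy(s))      # agreeing states get energy -1, mixed states +1
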
Q5) Illustrate how neural networks can be represented as
directed graphs.
Answer:
Directed graph representation:

Vertices: Neurons (units).

Directed edges: Weighted connections w_ij from neuron i to neuron j.

Topology: Acyclic (feedforward) vs cyclic (recurrent).

Example (2–2–1 MLP):

x1 → (h1) → y
x2 → (h2) ↗

Edges encode signal flow and causal order (feedforward DAG). Recurrent nets
add cycles, enabling state/memory.

Why graphs matter: Formalize topology, facilitate algorithms (topo sort, backprop
graph), and enable visualization/pruning.
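
A small sketch representing the feedforward net above as a weighted directed graph; the node names and edge weights are illustrative:

# Nodes and weighted directed edges of the 2-2-1 MLP above
nodes = ["x1", "x2", "h1", "h2", "y"]
edges = {                         # (source, target) -> weight w_ij
    ("x1", "h1"): 0.5, ("x1", "h2"): -0.3,
    ("x2", "h1"): 0.8, ("x2", "h2"): 0.1,
    ("h1", "y"): 1.2,  ("h2", "y"): -0.7,
}

# Feedforward nets are DAGs: a topological order exists, so signals can be
# propagated by visiting nodes in that order (recurrent nets would add cycles).
topo_order = ["x1", "x2", "h1", "h2", "y"]
for node in topo_order:
    fan_in = [(src, w) for (src, dst), w in edges.items() if dst == node]
    print(node, "<-", fan_in)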

Q6) Discuss the significance of the credit assignment problem in neural learning.
Answer:

Definition: Determining which parameters/units caused an observed error and by how much.
Significance:

In deep/recurrent nets, many transformations lie between input and loss.

Without principled credit assignment, learning is slow/unstable.

Solutions:

Backpropagation (gradient descent): Uses the chain rule to distribute partial derivatives layer-wise.

Temporal credit (RL): TD-learning, eligibility traces.

Architectural aids: Residual connections, normalization, attention.

Regularization & attribution: Dropout, weight decay, Shapley values or gradient×input attributions for post-hoc interpretability.

Q7) Analyze the statistical nature of the learning process in artificial neural networks.
Answer:
Statistical framing:

Empirical risk minimization: min_θ (1/N) Σ_i ℓ(f_θ(x_i), t_i).

Stochastic gradients: Updates use mini-batches → noisy but unbiased gradient estimates.

Bias–variance trade-off: Model capacity vs generalization.

Regularization as priors: L2 ↔ Gaussian prior; L1 ↔ Laplace prior.

Generalization controls: Early stopping, cross-validation, data augmentation.

Implication: Learning is probabilistic, with uncertainty from finite data, noise, and
optimization stochasticity.
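
A compact sketch of this statistical view: empirical risk minimization with noisy mini-batch gradients on a linear model (all data synthetic and illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # finite sample -> estimation uncertainty
t = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

w, eta = np.zeros(3), 0.05
for step in range(500):
    idx = rng.choice(len(X), size=16)    # mini-batch: noisy, unbiased gradient
    xb, tb = X[idx], t[idx]
    grad = 2 * xb.T @ (xb @ w - tb) / len(idx)   # gradient of the batch MSE
    w -= eta * grad                      # stochastic gradient descent step
print(w)                                 # near the true weights, up to noise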

Q8) Describe various network architectures and identify their typical applications.
Answer:

Single-layer perceptron: Linear separator; quick classifiers.

MLP (feedforward): Universal approximator; tabular data, function approximation.

CNN: Local connectivity + weight sharing; images/vision.

RNN/Hopfield/Elman: Feedback/memory; sequences, associative memory.

SOM (Kohonen): Topology-preserving maps; clustering/visualization.

Autoencoders: Representation learning, denoising.

Boltzmann/EBMs: Generative modeling, combinatorial structures.

Q9) Given a scenario, identify which learning rule (Hebbian, error correction, etc.) is most appropriate and justify.
Answer (three mini-scenarios):

1. Unlabeled sensory data → feature discovery: Hebbian / competitive / SOM (no targets; correlations matter).

2. Labeled classification with clear targets: Error-correction / Delta / backprop (objective-driven).

3. Learning a data distribution, capturing co-occurrence patterns: Boltzmann learning / contrastive divergence (probabilistic modeling).

Justification: Match signal availability (labels vs none) and objective (features vs predictive accuracy vs likelihood).

Q10) Explain the role of memory and adaptation in the learning process of neural networks.
Answer:

Memory:

Parametric: Stored in the weights W after training.

Stateful: Hidden states in RNNs/Hopfield store recent/associative information.

Adaptation:

Online updates W ← W − η ∇ℓ track non-stationary environments.

Plasticity rules (Hebbian/Oja) modulate representations.

Outcome: Memory enables retention, adaptation enables responsiveness; together they yield learning systems that remain useful as data evolve.

PART B

Q1) What is a neural network? Explain its architecture and
components in detail.
A neural network is a computational model inspired by the structure and
functioning of the human brain. It is composed of interconnected processing
elements called neurons, which collectively work to solve problems like
classification, regression, and pattern recognition. Neural networks are capable of
learning from data by adjusting internal parameters (weights and biases) to
minimize errors in prediction.

Architecture and Components:

1. Input Layer: Receives raw data (features). Each input node corresponds to
one feature.

2. Hidden Layers: Intermediate layers where weighted sums are computed, followed by non-linear activation functions. These layers enable the network to capture complex, non-linear relationships.

3. Output Layer: Produces the final prediction or decision. The number of output
nodes depends on the problem type (e.g., 1 for binary classification, multiple
for multi-class).

4. Weights: Parameters that control the strength of connections between neurons. Each input is multiplied by its weight before aggregation.

5. Bias: An additional parameter that shifts the activation function, allowing flexibility.

6. Activation Functions: Non-linear functions like Sigmoid, Tanh, or ReLU that determine whether a neuron should be “activated” or not. They provide the non-linearity essential for solving complex tasks.

7. Loss Function: Measures the difference between the predicted output and the
target value (e.g., Mean Squared Error, Cross-Entropy).

8. Learning Algorithm: Typically gradient descent, which updates weights and biases to minimize the loss.

Diagram:

Input Layer → Hidden Layer(s) → Output Layer
  x1, x2         h1, h2, h3          y
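
A minimal forward pass through such a network (NumPy; the layer sizes and random weights are illustrative, not trained values):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # input (2) -> hidden (3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)   # hidden (3) -> output (1)

x = np.array([0.5, -1.0])    # input layer: one feature vector
h = sigmoid(W1 @ x + b1)     # hidden layer: weighted sum + bias + activation
y = sigmoid(W2 @ h + b2)     # output layer: prediction in (0, 1)
print(y)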

Q2) Describe the biological neural network and its comparison with artificial neuron models.
A biological neural network refers to the network of neurons in the human brain
and nervous system. A biological neuron consists of three main parts:

Dendrites: Receive incoming signals.

Soma (cell body): Processes signals.

Axon: Transmits the processed signal to other neurons via synapses.

When the input exceeds a threshold, the neuron fires an electrical impulse (action
potential), which travels along the axon and influences other neurons.

Artificial Neuron Model:

Artificial neural networks (ANNs) attempt to mimic this behavior computationally. Inputs are given as numerical values, multiplied by weights (synaptic strength), added with a bias, and then passed through an activation function to produce the output.

Comparison:

Biological neurons work through electrochemical signals; artificial neurons operate using mathematical computations.

Biological neurons learn by changing synaptic strengths through chemical processes; artificial neurons adapt weights through algorithms like backpropagation.

The brain has billions of neurons connected massively in parallel, while artificial neural networks are comparatively smaller but computationally efficient.

Biological neurons process information continuously in real time, whereas ANNs work in discrete computational steps.

Q3) Explain the types of neural network architectures with
diagrams.
Neural network architectures vary depending on connectivity, flow of data, and
application.

1. Feedforward Neural Networks (FNN):
Information flows in one direction, from input to hidden to output, with no cycles. Suitable for classification, regression, and recognition tasks.

Input → Hidden → Output

2. Recurrent Neural Networks (RNN):
Connections form cycles, allowing the network to retain memory of past inputs (a one-step state update is sketched at the end of this answer). Used in speech recognition, time-series prediction, and natural language processing.

x_t → h_t → y_t
       ↑
    h_{t-1}

3. Convolutional Neural Networks (CNN):
Specialized for spatial data like images. They use convolutional layers for feature extraction and pooling layers for down-sampling.

Image → Convolution → Pooling → Fully Connected → Output

4. Self-Organizing Maps (SOM):
An unsupervised architecture where neurons compete to represent input data. Used in clustering and visualization.

5. Autoencoders:
Symmetrical networks used for dimensionality reduction and feature learning.
They consist of an encoder and a decoder.
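
As a concrete illustration of the recurrent cycle in (2), here is a one-step RNN state update (shapes and random weights are illustrative, not trained values):

import numpy as np

rng = np.random.default_rng(2)
W_xh = rng.normal(size=(4, 3))   # input -> hidden
W_hh = rng.normal(size=(4, 4))   # hidden -> hidden: the feedback cycle
W_hy = rng.normal(size=(2, 4))   # hidden -> output

h = np.zeros(4)                  # h_{t-1}: the network's memory of past inputs
for x_t in rng.normal(size=(5, 3)):        # a length-5 input sequence
    h = np.tanh(W_xh @ x_t + W_hh @ h)     # h_t depends on x_t and h_{t-1}
    y_t = W_hy @ h
print(y_t)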

Q4) Describe Hebbian learning and its significance in neural
network training.
Definition: Hebbian learning is an unsupervised learning rule inspired by
biological neurons. It states: “Neurons that fire together, wire together.” In other
words, the connection (synaptic weight) between two neurons is strengthened
when both are active simultaneously.
Mathematical Formulation:

Δw_i = η x_i y

where x_i = input, y = output, and η = learning rate.

Significance:

Hebbian learning forms the foundation of associative memory and self-organization in neural networks.

It is biologically plausible and explains how correlations between inputs and outputs can form.

It is particularly useful for unsupervised feature learning, competitive learning, and self-organizing maps.

A normalized version (Oja’s rule) avoids unbounded weight growth and helps
networks learn principal components.
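
A short sketch of the rule in action on correlated inputs; plain Hebb appears as a comment, and Oja’s normalized version is the one run (synthetic data, illustrative learning rate):

import numpy as np

rng = np.random.default_rng(3)
eta, w = 0.01, rng.normal(size=2)

for _ in range(2000):
    x = rng.multivariate_normal([0, 0], [[3, 2], [2, 3]])  # correlated inputs
    y = w @ x                      # linear neuron output
    # Plain Hebb would be: w += eta * y * x   (norm grows without bound)
    w += eta * y * (x - y * w)     # Oja's rule: Hebb plus a normalizing term

print(w / np.linalg.norm(w))       # approaches the principal component, ±(0.71, 0.71)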

Q5) Discuss memory-based and error-correction learning rules with suitable examples.
Memory-Based Learning:

In memory-based learning, the system stores training instances and makes predictions by comparing new inputs with stored examples.

Example: k-Nearest Neighbors (k-NN), where classification is based on the majority label among the nearest stored samples.

Neural equivalent: Radial Basis Function (RBF) networks, where the centers
act as stored prototypes.

Error-Correction Learning:

In error-correction learning, the network adjusts weights based on the
difference between the actual output and the target output.

Example: Perceptron Learning Rule and Delta Rule.

Formula: Δw = η (t − y) x, where t = target, y = predicted output.

Comparison:

Memory-based learning is simple but computationally expensive during prediction.

Error-correction learning is more efficient and generalizes better when enough training data is available.
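
A sketch of error-correction training with the delta/perceptron update on a tiny labeled set (the data and learning rate are illustrative):

import numpy as np

# Tiny linearly separable set: label 1 when x1 + x2 > 1, else 0
X = np.array([[0.0, 0.2], [0.3, 0.1], [0.9, 0.8], [1.0, 0.6]])
t = np.array([0.0, 0.0, 1.0, 1.0])

eta, w, b = 0.5, np.zeros(2), 0.0
for epoch in range(20):
    for x_i, t_i in zip(X, t):
        y = 1.0 if w @ x_i + b > 0 else 0.0   # threshold unit output
        w += eta * (t_i - y) * x_i            # delta rule: error times input
        b += eta * (t_i - y)
print(w, b)                                   # stops changing once all errors are 0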

Q6) What is competitive learning? Explain its algorithm and applications.
Definition: Competitive learning is an unsupervised learning technique where
neurons compete to respond to an input, and only one neuron (the winner) gets
updated. This is also called winner-take-all (WTA) learning.

Algorithm:

1. Present input vector x.

2. Compute similarity between x and each neuron’s weight vector w_j.

3. Select the neuron with the highest similarity (winner).

4. Update only the winner’s weights: w_k ← w_k + η (x − w_k).

Applications:

Data clustering and vector quantization.

Self-organizing feature maps.

Image compression.

Pattern recognition and speech coding.
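
A minimal winner-take-all sketch that classifies 2-D patterns into two clusters (synthetic data; the learning rate and cluster centers are illustrative):

import numpy as np

rng = np.random.default_rng(4)
# Two clusters of input patterns around (0, 3) and (3, 0)
X = np.vstack([rng.normal([0.0, 3.0], 0.3, size=(20, 2)),
               rng.normal([3.0, 0.0], 0.3, size=(20, 2))])
rng.shuffle(X)

W = rng.normal(size=(2, 2))      # one weight vector per competing neuron
eta = 0.1
for x in X:
    k = np.argmin(np.linalg.norm(W - x, axis=1))  # winner: closest prototype
    W[k] += eta * (x - W[k])     # update only the winner, toward the input

print(W)   # rows drift toward the cluster centers (dead units are possible
           # with unlucky initialization)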

Q7) Explain Boltzmann learning and how it differs from Hebbian learning.
Boltzmann Learning:
It is a stochastic learning rule used in Boltzmann Machines. These networks are
energy-based models where the learning aims to minimize the energy of desired
states and maximize the energy of undesired states.

The weight update is Δw_ij = η (⟨s_i s_j⟩_data − ⟨s_i s_j⟩_model); it adjusts the weights to make the model distribution closer to the data distribution.
Difference from Hebbian Learning:

Hebbian learning is local and correlation-based, while Boltzmann learning is probabilistic and considers both positive (data) and negative (model) phases.

Boltzmann learning explicitly shapes the probability distribution, whereas Hebbian learning focuses only on strengthening connections between co-active neurons.

Q8) Discuss the Credit Assignment Problem in neural networks and possible solutions.
The Credit Assignment Problem (CAP) refers to the difficulty of determining how to assign responsibility (credit or blame) for the network’s output error to individual weights and neurons, especially in deep or recurrent networks.
Challenges:

Many layers obscure the contribution of individual neurons.

In time-dependent tasks, earlier states must be assigned credit for later outcomes.

Solutions:

1. Backpropagation: Uses the chain rule of calculus to assign error gradients to each weight.

2. Backpropagation Through Time (BPTT): An extension of backpropagation for recurrent networks.

3. Reinforcement Learning: Uses temporal difference learning and eligibility traces to assign credit in delayed-reward problems.

4. Architectural Methods: Residual connections and attention mechanisms simplify credit assignment.
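
A hand-rolled sketch of solution (1): the chain rule distributing credit through a two-layer network (all values illustrative):

import numpy as np

x = np.array([0.5, -1.0])
W1 = np.array([[0.2, 0.4], [-0.3, 0.1]])
w2 = np.array([0.7, -0.5])
t = 1.0

# Forward pass
h = np.tanh(W1 @ x)              # hidden activations
y = w2 @ h                       # linear output
loss = 0.5 * (y - t) ** 2

# Backward pass: each weight receives its share of the blame for the error
dy = y - t                              # dL/dy
dw2 = dy * h                            # credit assigned to output weights
dh = dy * w2                            # error propagated back to hidden units
dW1 = np.outer(dh * (1 - h**2), x)      # credit for first-layer weights (tanh')
print(dw2)
print(dW1)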

Q9) What is the role of memory and adaptation in learning systems?
Memory and adaptation are fundamental to making neural networks intelligent and
flexible.

Memory: Neural networks store knowledge in their weights and sometimes in hidden states (as in RNNs). Associative memory models (Hopfield networks) explicitly store patterns and recall them when partial inputs are given.

Adaptation: Learning is a process of continuous adaptation. As new data arrives, weights are updated, ensuring the model remains relevant. Adaptation allows handling non-stationary environments and concept drift.

Together, memory ensures retention of past experiences, while adaptation allows flexibility to adjust to new conditions.

Q10) Describe how neural networks are related to Artificial Intelligence and knowledge representation.

Neural networks are a crucial part of modern Artificial Intelligence (AI) because
they enable knowledge representation through learned features. In traditional AI,
knowledge was symbolic, rule-based, and manually crafted. Neural networks
instead learn distributed representations, where concepts are represented as
patterns of activity across many neurons.

In vision, CNNs learn hierarchical features (edges → shapes → objects).

In NLP, embeddings capture semantic relationships between words.

In reinforcement learning, networks approximate value functions and policies.

Thus, neural networks provide a flexible, powerful method of encoding, storing, and applying knowledge, forming the backbone of AI systems today.

# Part B – Formulas (Module 1)

# Neuron output
y = φ( Σ (w_i * x_i) + b )

# Sigmoid activation
σ(z) = 1 / (1 + e^(-z))

# Tanh activation
tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z))

# ReLU activation
ReLU(z) = max(0, z)

# Loss functions
MSE = (1/N) * Σ ( (y_i - t_i)^2 )
CrossEntropy = - Σ (t_i * log(y_i))

# Hebbian learning rule
Δw_i = η * x_i * y

# Oja’s rule (normalized Hebbian)
Δw = η * ( y * x - y^2 * w )

# Error-correction learning (Delta rule)
Δw = η * (t - y) * x

# Competitive learning update
w_k ← w_k + η * (x - w_k)

# Boltzmann energy function
E(s) = -0.5 * Σ Σ (w_ij * s_i * s_j) - Σ (θ_i * s_i)

# Boltzmann weight update
Δw_ij = η * ( ⟨s_i * s_j⟩_data - ⟨s_i * s_j⟩_model )

# Empirical Risk Minimization (ERM)
R_emp(θ) = (1/N) * Σ ( ℓ(f_θ(x_i), t_i) )

# Gradient descent update
w ← w - η * (∂ℓ/∂w)

— By K. Shashank
