Deep Learning Notes - January 2025
Neural Networks & Deep Learning
From Perceptrons to Deep Architectures - My Learning Journey
Big Picture: Neural Networks are inspired by the human brain, using
layers of interconnected "neurons" to learn complex patterns. Deep
Learning = Neural Networks with many layers!
1. The Biological Inspiration
Just like our brain has ~86 billion neurons connected by synapses, artificial
neural networks have nodes (neurons) connected by weights. The magic
happens when these simple units work together!
Key Insight: Each neuron does something simple, but together they
can approximate any continuous function to arbitrary accuracy, given
enough neurons (Universal Approximation Theorem)
2. The Perceptron - Where It All Began
Single Perceptron Model:
Output = activation(Σ(wi × xi) + bias)
where: wi = weights, xi = inputs, bias = threshold adjustment
Input Layer       Perceptron       Output

x1 ----w1----\
              \
x2 ----w2----- [Σ → f()] → y
              /
x3 ----w3----/
      +
     bias
Limitations of Single Perceptron:
Can only solve linearly separable problems
XOR problem exposed this limitation!
Solution? Stack multiple layers → Multi-Layer Perceptron (MLP)
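To make the perceptron formula concrete, here's a minimal NumPy sketch with a step activation. The AND-gate weights are my own hand-picked illustration, not from any library:

import numpy as np

def perceptron(x, w, bias):
    # weighted sum of inputs plus bias, then step activation
    return 1 if np.dot(w, x) + bias > 0 else 0

# Hand-picked weights implementing an AND gate (linearly separable)
w, bias = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x), w, bias))

# No single (w, bias) reproduces XOR -- hence the need for hidden layers.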
3. Anatomy of a Neural Network
Essential Components:
Input Layer - Raw features (pixels, words, numbers)
Hidden Layers - Where the learning happens
Output Layer - Final predictions
Weights & Biases - The parameters we learn
Activation Functions - Add non-linearity
Simple Neural Network Architecture (fully connected):

[784 inputs] → [128 neurons] → [64 neurons] → [10 classes]
   Input       Hidden Layer 1   Hidden Layer 2    Output
4. Activation Functions - Adding Non-linearity
Without activation functions, even deep networks would just be linear
transformations!
Common Activation Functions:
1. ReLU (Rectified Linear Unit) - My go-to for hidden layers!
f(x) = max(0, x)
Pros: simple, fast, gradient doesn't vanish for positive inputs
Cons: "dead neurons" problem (units stuck outputting zero)
2. Sigmoid - Classic, outputs between 0 and 1
f(x) = 1 / (1 + e^(-x))
Use case: binary classification output layer
Issue: vanishing gradients in deep networks
3. Tanh - Centered around zero
f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Better than sigmoid for hidden layers because its outputs are zero-centered
4. Softmax - For multi-class output
f(xi) = e^xi / Σ(e^xj)
Outputs probability distribution
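All four in NumPy as a quick sketch (subtracting the max before exponentiating in softmax is a standard numerical-stability trick I'm adding, not part of the formula above):

import numpy as np

def relu(x):    return np.maximum(0, x)
def sigmoid(x): return 1 / (1 + np.exp(-x))
def tanh(x):    return np.tanh(x)

def softmax(x):
    # subtract max for numerical stability; result sums to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(relu(z), sigmoid(z), tanh(z), softmax(z))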
5. Forward Propagation
The journey of data through the network:
1. Input data enters
2. Multiply by weights, add bias
3. Apply activation function
4. Pass to next layer
5. Repeat until output
Layer output: a[l] = activation(W[l] × a[l-1] + b[l])
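As a sketch of that layer equation, here's a forward pass through the 784 → 128 → 64 → 10 network from section 3, using random weights (the shapes are the point, the values are made up):

import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 128, 64, 10]
# W[l] has shape (out, in); b[l] has shape (out,)
Ws = [rng.standard_normal((o, i)) * 0.01 for i, o in zip(sizes, sizes[1:])]
bs = [np.zeros(o) for o in sizes[1:]]

a = rng.standard_normal(784)          # a[0] = input
for l, (W, b) in enumerate(zip(Ws, bs)):
    z = W @ a + b                     # linear step: W[l] × a[l-1] + b[l]
    a = np.maximum(0, z) if l < len(Ws) - 1 else z   # ReLU on hidden layers only
print(a.shape)                        # (10,) -- the output logits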
6. Backpropagation - The Learning Magic
The Chain Rule is Everything!
Backprop = Computing gradients using chain rule + Gradient descent
Steps in Backpropagation:
1. Forward Pass: Compute predictions
2. Calculate Loss: How wrong were we?
3. Backward Pass: Compute gradients using chain rule
4. Update Weights: W = W - α × ∂L/∂W
Weight Update Rule:
W[l] = W[l] - α × ∂L/∂W[l]
b[l] = b[l] - α × ∂L/∂b[l]
where α = learning rate
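A minimal sketch of one full cycle (forward, loss, chain-rule gradients, update) for a single linear neuron with squared-error loss; all the numbers are made up for illustration:

import numpy as np

x = np.array([1.0, 2.0, 3.0])      # inputs
y_true = 2.0                       # target
W = np.array([0.1, 0.1, 0.1])      # weights
b, alpha = 0.0, 0.01               # bias, learning rate

# 1-2. Forward pass and loss
y_pred = W @ x + b
loss = (y_true - y_pred) ** 2

# 3. Backward pass (chain rule): dL/dW = 2(y_pred - y_true) × x
dL_dy = 2 * (y_pred - y_true)
dL_dW = dL_dy * x
dL_db = dL_dy

# 4. Update rule from above: W = W - α × ∂L/∂W
W -= alpha * dL_dW
b -= alpha * dL_db
print(loss, W, b)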
7. Loss Functions
For Regression:
MSE : L = (1/n) × Σ(y_true - y_pred)²
MAE : L = (1/n) × Σ|y_true - y_pred|
For Classification:
Binary Cross-Entropy : -Σ(y×log(p) + (1-y)×log(1-p))
Categorical Cross-Entropy : -Σ(y×log(p))
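NumPy versions of these losses (I'm averaging the cross-entropies over samples for consistency with MSE/MAE, and clipping probabilities to avoid log(0)):

import numpy as np

def mse(y_true, y_pred): return np.mean((y_true - y_pred) ** 2)
def mae(y_true, y_pred): return np.mean(np.abs(y_true - y_pred))

def binary_cross_entropy(y, p):
    p = np.clip(p, 1e-12, 1 - 1e-12)       # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(y, p):       # y, p: (n_samples, n_classes)
    p = np.clip(p, 1e-12, 1.0)
    return -np.mean(np.sum(y * np.log(p), axis=1))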
8. Optimization Algorithms
Gradient Descent is great, but we can do better!
Evolution of Optimizers:
SGD (Stochastic Gradient Descent) - The classic
Momentum - Adds velocity to updates
RMSprop - Adaptive learning rates
Adam - Combines momentum + RMSprop (my favorite!)
Adam Update:
m = β1×m + (1-β1)×gradient
v = β2×v + (1-β2)×gradient²
m̂ = m/(1-β1^t), v̂ = v/(1-β2^t)   ← bias correction at step t
W = W - α × m̂/(√v̂ + ε)
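A from-scratch sketch of that update with the standard defaults (β1=0.9, β2=0.999, ε=1e-8), including the bias-correction terms:

import numpy as np

def adam_step(W, grad, m, v, t, alpha=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad           # momentum (1st moment)
    v = b2 * v + (1 - b2) * grad ** 2      # RMSprop-style (2nd moment)
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    W = W - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return W, m, v

W = np.array([1.0, -2.0])
m = v = np.zeros_like(W)
for t in range(1, 5001):                   # minimize L = ||W||², so grad = 2W
    W, m, v = adam_step(W, 2 * W, m, v, t)
print(W)                                   # converges toward [0, 0]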
9. Regularization Techniques
Fighting Overfitting: Great training accuracy but poor test accuracy?
Time to regularize!
Key Techniques:
L1/L2 Regularization - Add penalty to loss function
Dropout - Randomly "turn off" neurons during training
Early Stopping - Stop when validation loss increases
Batch Normalization - Normalize inputs to each layer
Data Augmentation - Create more training data
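Two of these take one line each in PyTorch, as a sketch: dropout is a layer, and L2 regularization is the weight_decay argument on the optimizer (the layer sizes here are just for illustration):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Dropout(p=0.2),          # randomly zero 20% of activations during training
    nn.Linear(128, 10),
)
# weight_decay adds an L2 penalty on the weights
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

model.train()   # dropout active
model.eval()    # dropout disabled for validation/inference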
10. Deep Learning Architectures
Convolutional Neural Networks (CNNs)
Specialized for images - use convolution operations
CNN Architecture:
Input (image) → Conv (feature maps) → Pool (reduced) → Conv → Pool
→ Flatten → Dense → Output (classification)
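A small PyTorch sketch of that Conv → Pool → Conv → Pool → Flatten → Dense pattern (the channel counts and the 28×28 single-channel input are my own illustrative choices):

import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # (1, 28, 28) -> (16, 28, 28)
    nn.ReLU(),
    nn.MaxPool2d(2),                              # -> (16, 14, 14)
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> (32, 14, 14)
    nn.ReLU(),
    nn.MaxPool2d(2),                              # -> (32, 7, 7)
    nn.Flatten(),                                 # -> 32*7*7 = 1568 features
    nn.Linear(32 * 7 * 7, 10),                    # -> 10 class logits
)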
Recurrent Neural Networks (RNNs)
For sequential data - have memory!
Vanilla RNN - Simple but suffers from vanishing gradients
LSTM - Long Short-Term Memory (gates mitigate the vanishing-gradient problem)
GRU - Gated Recurrent Unit (simpler than LSTM)
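A quick sketch of running an LSTM in PyTorch; the sizes are arbitrary illustrations:

import torch
import torch.nn as nn

# batch_first=True means input shape is (batch, seq_len, features)
lstm = nn.LSTM(input_size=10, hidden_size=32, num_layers=1, batch_first=True)
x = torch.randn(4, 20, 10)            # batch of 4 sequences, 20 steps each
output, (h_n, c_n) = lstm(x)
print(output.shape)                   # (4, 20, 32): hidden state at every step
print(h_n.shape)                      # (1, 4, 32): final hidden state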
Transformer Architecture
The revolution in NLP - "Attention is all you need"
Self-attention mechanism allows model to focus on relevant parts of
input
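A from-scratch sketch of single-head scaled dot-product self-attention, the paper's Attention(Q, K, V) = softmax(QKᵀ/√d_k)V; the dimensions are illustrative:

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Project the input into queries, keys, and values
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # how much each token attends to each other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                           # weighted sum of values

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))                  # 5 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)       # (5, 8)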
11. Training Tips & Tricks
Personal Best Practices:
Start with a small network, gradually increase complexity
Always monitor training AND validation loss
Learning rate is crucial - try 0.001 as starting point
Batch size affects convergence - powers of 2 work well
Save checkpoints regularly!
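For the checkpointing tip, the standard PyTorch pattern looks like this (assuming model, optimizer, and epoch come from your training loop; saving state_dicts is the recommended format rather than pickling the whole model object):

import torch

# Save
torch.save({
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "epoch": epoch,
}, "checkpoint.pt")

# Load and resume
ckpt = torch.load("checkpoint.pt")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])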
12. Common Problems & Solutions
Vanishing Gradients:
Use ReLU instead of sigmoid/tanh
Proper weight initialization (Xavier/He)
Batch normalization
Exploding Gradients:
Gradient clipping
Proper weight initialization
Lower learning rate
Overfitting:
More data!
Dropout layers
L1/L2 regularization
Reduce model complexity
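Two of these fixes in PyTorch, as a self-contained sketch: He (Kaiming) initialization for the vanishing-gradient side, and gradient-norm clipping (called between backward() and step()) for the exploding side. The tiny model is just a stand-in:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))

# Vanishing gradients: He initialization suits ReLU networks
for layer in model.modules():
    if isinstance(layer, nn.Linear):
        nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
        nn.init.zeros_(layer.bias)

# Exploding gradients: clip the global gradient norm before the update
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(4, 10)).pow(2).mean()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()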
13. PyTorch Implementation Snippet
Simple Neural Network in PyTorch:
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)   # input -> hidden layer 1
        self.fc2 = nn.Linear(128, 64)    # hidden layer 1 -> hidden layer 2
        self.fc3 = nn.Linear(64, 10)     # hidden layer 2 -> output (10 classes)
        self.relu = nn.ReLU()            # non-linearity
        self.dropout = nn.Dropout(0.2)   # drop 20% of activations while training

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)                  # raw logits (CrossEntropyLoss applies softmax)
        return x
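And a minimal training-step sketch using that network; the data tensors here are random stand-ins for real flattened-image batches:

import torch

model = SimpleNN()
criterion = nn.CrossEntropyLoss()              # expects raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

x = torch.randn(32, 784)                       # fake batch of 32 flattened images
y = torch.randint(0, 10, (32,))                # fake labels

optimizer.zero_grad()                          # clear old gradients
loss = criterion(model(x), y)                  # forward pass + loss
loss.backward()                                # backprop
optimizer.step()                               # gradient descent update
print(loss.item())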
14. Hyperparameter Tuning
Hyperparameters to Tune (in order of importance):
1. Learning rate
2. Number of layers & neurons
3. Batch size
4. Dropout rate
5. Activation functions
6. Optimizer choice
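A sketch of random search over the two knobs at the top of this list; random search usually beats grid search for the same budget. train_and_evaluate is a hypothetical placeholder standing in for a real training run:

import random

def train_and_evaluate(lr, batch_size):
    # Hypothetical stand-in: replace with a real training run
    # that returns validation accuracy
    return random.random()

best_config, best_acc = None, 0.0
for trial in range(20):
    lr = 10 ** random.uniform(-4, -2)          # log-uniform sample in [1e-4, 1e-2]
    batch_size = random.choice([32, 64, 128, 256])
    acc = train_and_evaluate(lr, batch_size)
    if acc > best_acc:
        best_config, best_acc = (lr, batch_size), acc
print(best_config, best_acc)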
15. My Learning Resources
Deep Learning by Ian Goodfellow (the bible!)
fast.ai courses - practical approach
3Blue1Brown neural network series - visual intuition
Papers With Code - latest research
PyTorch tutorials - hands-on practice
Final Thoughts:
Neural networks seemed like magic at first, but they're just clever math!
The key is understanding the fundamentals - forward prop, backprop,
and gradient descent. Everything else builds on these concepts.
"Deep learning is not a black box - it's a very complex but understandable
system of simple operations"