Iliya Valchanov
Neural Networks
Overview
Table of Contents:
Abstract
1. The Layer
2. What is a Deep Net?
3. Why Do We Need Non-Linearities to Stack Layers?
4. Activation Functions
4.1 Common Activation Functions
4.2 Softmax Activation
5. Backpropagation
5.1 Backpropagation Formula
Abstract
In these course notes, we cover the advanced machine learning algorithm known as deep learning, which is capable of creating highly accurate predictive models without being given any explicit instructions. It accomplishes this by working with large amounts of unstructured data and mimics the human learning process by building a hierarchy of complex abstractions.
Keywords: deep net, backpropagation formula, Softmax activation, layers
1. The Layer
The layer is the building block of neural networks. An initial linear combination and an added non-linearity form a simple neural network.

[Figure: the minimal example. The inputs x1 and x2 enter the input layer, pass through a linear combination xw + b and a non-linearity ∫, and produce the output y at the output layer.]

In the minimal example, we trained a neural network which had no depth: there were solely an input layer and an output layer, and the output was simply a linear combination of the input. Neural networks step on linear combinations but add a non-linearity to each one of them. Mixing linear combinations and non-linearities allows us to model arbitrary functions.
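A minimal sketch of this computation in NumPy (the input values, weights, bias, and the choice of sigmoid as the non-linearity are illustrative assumptions):

```python
import numpy as np

def sigmoid(a):
    # A common non-linearity (activation function); see section 4.
    return 1 / (1 + np.exp(-a))

x = np.array([0.5, -1.0])   # input layer: x1, x2
w = np.array([0.8, 0.3])    # weights (illustrative values)
b = 0.1                     # bias (illustrative value)

a = x @ w + b               # linear combination: xw + b
y = sigmoid(a)              # non-linearity applied on top
print(y)                    # the output of the simple neural network
```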
2. What is a Deep Net?
This is a deep neural network (deep net) with 5 layers.

How to read this diagram: a circle is a unit (a neuron), a column of units is a layer, and the arrows represent mathematical transformations.

[Figure: a deep net with an input layer, three hidden layers (hidden layer 1, hidden layer 2, hidden layer 3), and an output layer.]
The width of a layer is the number of units in that layer.
The width of the net is the number of units in its biggest layer.
The depth of the net is equal to the number of layers or to the number of hidden layers. The term has different definitions; more often than not, we are interested in the number of hidden layers (as there are always an input layer and an output layer).
The width and the depth of the net are called hyperparameters. They are values we manually choose when creating the net.
[Figure: the same deep net, annotated with its width (the number of units in the biggest layer) and its depth (the number of hidden layers: hidden layer 1, hidden layer 2, hidden layer 3).]
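A minimal sketch of a deep net defined by these hyperparameters, in plain NumPy (the layer sizes, the random initialization, and the ReLU activation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyperparameters we choose manually: 8 input units, three hidden
# layers of 9 units each (so the width of the net is 9 and the depth
# is 3 hidden layers), and 4 output units.
layer_sizes = [8, 9, 9, 9, 4]

# One weight matrix and one bias vector per pair of consecutive layers.
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    # Push the input through every layer:
    # a linear combination followed by a non-linearity.
    for w, b in zip(weights, biases):
        x = np.maximum(0, x @ w + b)   # ReLU non-linearity
    return x

print(forward(rng.normal(size=8)))
```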
3. Why Do We Need Non-Linearities to Stack Layers?
Here you can see a net with no non-linearities: just linear combinations.

[Figure: an input layer, one hidden layer, and an output layer, connected by two consecutive linear transformations.]

h = x w1
y = h w2 = x w1 w2 = x w, where w = w1 w2

If x is a 1×8 input, w1 is 8×9, and w2 is 9×4, then w = w1 w2 is a single 8×4 matrix and y is 1×4. Two consecutive linear transformations are equivalent to a single one, so stacking layers without non-linearities adds nothing: the whole net collapses into one linear transformation.
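A quick numerical check of this equivalence in NumPy (the shapes match the diagram; the random values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=(1, 8))    # input: 1x8
w1 = rng.normal(size=(8, 9))   # first linear transformation: 8x9
w2 = rng.normal(size=(9, 4))   # second linear transformation: 9x4

# Two consecutive linear transformations...
y_stacked = (x @ w1) @ w2

# ...are equivalent to a single one, with w = w1 w2 (an 8x4 matrix).
w = w1 @ w2
y_single = x @ w

print(np.allclose(y_stacked, y_single))   # True
```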
4. Activation Functions
[Figure: the inputs x1 and x2 pass through a linear combination xw + b and then through an activation function ∫.]
In the respective lesson, we gave an example with a temperature change. The temperature starts decreasing (a numerical change). Our brain acts as a kind of 'activation function': it tells us whether it is cold enough for us to put on a jacket. Putting on a jacket is a binary action: 0 (no jacket) or 1 (jacket). This is a very intuitive and visual (yet not so practical) example of how activation functions work.

Activation functions (non-linearities) are needed so we can break the linearity and represent more complicated relationships. Moreover, activation functions are required in order to stack layers. They transform inputs into outputs of a different kind.
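In code, the jacket decision is a step-like function (a toy sketch; the threshold of a 5-degree drop is an illustrative assumption):

```python
def put_on_jacket(temperature_change):
    # Binary action: 1 (jacket) if the temperature dropped enough,
    # 0 (no jacket) otherwise. The brain acts as a step-like
    # 'activation function' over the numerical input.
    return 1 if temperature_change <= -5 else 0

print(put_on_jacket(-8))   # 1: cold enough, put on a jacket
print(put_on_jacket(2))    # 0: no jacket needed
```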
4.1 Common Activation Functions
Name: formula; derivative; range
sigmoid (logistic function): σ(a) = 1 / (1 + e^(−a)); derivative σ(a)(1 − σ(a)); range (0, 1)
tanh (hyperbolic tangent): tanh(a) = (e^a − e^(−a)) / (e^a + e^(−a)); derivative 1 − tanh²(a); range (−1, 1)
ReLU (rectified linear unit): relu(a) = max(0, a); derivative 0 for a < 0 and 1 for a > 0; range [0, ∞)
softmax: σ(a)_i = e^(a_i) / Σ_j e^(a_j); range (0, 1)
All common activation functions are monotonic, continuous, and differentiable: these are important properties needed for optimization.
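A minimal sketch of these functions and their derivatives in NumPy, matching the formulas above (NumPy's built-in np.tanh covers the hyperbolic tangent itself):

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))        # range (0, 1)

def sigmoid_prime(a):
    s = sigmoid(a)
    return s * (1 - s)                 # derivative of the sigmoid

def tanh_prime(a):
    return 1 - np.tanh(a) ** 2         # derivative of tanh

def relu(a):
    return np.maximum(0, a)            # range [0, inf)

def relu_prime(a):
    return (a > 0).astype(float)       # 0 for a < 0, 1 for a > 0

a = np.linspace(-3, 3, 7)
print(sigmoid(a))
print(np.tanh(a))
print(relu(a))
```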
4.2 Softmax Activation
[Figure: the hidden layer output h is combined linearly into a = hw + b, and the softmax activation is applied to a at the output layer.]
The softmax activation transforms a bunch of arbitrarily large or small numbers into a
valid probability distribution.
While other activation functions get an input value and transform it, regardless of
the other elements, the softmax considers the information about the whole set of
numbers we have.
The values that softmax outputs are in the range from 0 to 1 and their sum is exactly 1
(like probabilities).
Example: softmax([2, 1, 0]) ≈ [0.665, 0.245, 0.090], a valid probability distribution: each value lies between 0 and 1, and the values sum to 1.
The property of the softmax to output probabilities is so useful and intuitive that it is
often used as the activation function for the final (output) layer.
However, when the softmax is used prior to that (as the activation of a hidden layer),
the results are not as satisfactory. That’s because a lot of the information about the
variability of the data is lost.
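A minimal sketch of the softmax in NumPy (the input values are illustrative; subtracting the maximum before exponentiating is a standard numerical-stability trick that does not change the result):

```python
import numpy as np

def softmax(a):
    # Shift by the max so the exponentials cannot overflow.
    e = np.exp(a - np.max(a))
    return e / e.sum()

a = np.array([2.0, 1.0, 0.0])   # arbitrarily large or small numbers
p = softmax(a)
print(p)          # ~ [0.665, 0.245, 0.090]
print(p.sum())    # sums to 1 (up to floating point)
```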
5. Backpropagation
Forward propagation is the process of pushing inputs through the net. At the end of each epoch, the obtained outputs are compared to the targets to form the errors.

Backpropagation of errors is an algorithm for training neural networks with gradient descent. It consists of calculating the contribution of each parameter to the errors. We backpropagate the errors through the net and update the parameters (weights and biases) accordingly.
5.1 Backpropagation Formula
∂L/∂w_ij = x_i δ_j, where δ_j = σ'(a_j) Σ_k δ_k w_jk

Here, L is the loss, x_i is the i-th input to the layer, a_j is the linear combination at unit j, σ' is the derivative of the activation function, and the sum runs over the units k of the following layer, whose errors δ_k have already been backpropagated. For the output layer, δ_j is simply the derivative of the loss with respect to the linear combination at unit j.
If you want to examine the full derivation, please make use of the PDF we made available in the section "Backpropagation: A Peek into the Mathematics of Optimization".
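A minimal sketch of one backpropagation step for a single sigmoid layer in NumPy (the data, the squared-error loss, and the learning rate are illustrative assumptions, not the course's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

# Illustrative data: 5 samples with 3 inputs and 2 targets each.
x = rng.normal(size=(5, 3))
t = rng.uniform(size=(5, 2))

w = rng.normal(size=(3, 2))   # weights
b = np.zeros(2)               # biases
eta = 0.1                     # learning rate

# Forward propagation: push the inputs through the layer.
a = x @ w + b
y = sigmoid(a)

# Output errors for the squared-error loss L = 1/2 * sum((y - t)^2):
# delta_j = sigma'(a_j) * (y_j - t_j), with sigma'(a) = y * (1 - y).
delta = (y - t) * y * (1 - y)

# Contribution of each parameter to the errors: dL/dw_ij = x_i * delta_j.
grad_w = x.T @ delta
grad_b = delta.sum(axis=0)

# Update the parameters accordingly (gradient descent step).
w -= eta * grad_w
b -= eta * grad_b
```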
Copyright 2022 365 Data Science Ltd. Reproduction is forbidden unless authorized. All rights reserved.
Learn DATA SCIENCE anytime, anywhere, at your own pace.
If you found this resource useful, check out our e-learning program. We have
everything you need to succeed in data science.
Learn the most sought-after data science skills from the best experts in the field!
Earn a verifiable certificate of achievement trusted by employers worldwide and future-proof your career.
Comprehensive training, exams, certificates.
162 hours of video · 599+ Exercises · Downloadables · Exams & Certification · Personalized support · Resume Builder & Feedback · Portfolio advice · New content · Career tracks
Join a global community of 1.8M successful students with an annual subscription
at 60% OFF with coupon code 365RESOURCES.
$172.80/year (regular price $432)
Iliya Valchanov
Email: [email protected]