Deep Learning
Vazgen Mikayelyan
October 20, 2020
Activation functions
1. Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
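A minimal NumPy sketch of this formula (the helper name `sigmoid` is our own):

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x)); squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))
```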
2. Tanh: $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
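A sketch of the same formula; we defer to the built-in `np.tanh`, which computes this ratio in a numerically stable way:

```python
import numpy as np

def tanh(x):
    # tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)); output lies in (-1, 1)
    return np.tanh(x)
```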
3. Rectified linear unit: $\mathrm{ReLU}(x) = \max(0, x)$
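A one-line NumPy sketch:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x), applied elementwise
    return np.maximum(0.0, x)
```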
4. Leaky ReLU: $LR(x) = \begin{cases} 0.01x, & \text{for } x < 0 \\ x, & \text{for } x \ge 0 \end{cases}$
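A NumPy sketch of the piecewise definition above:

```python
import numpy as np

def leaky_relu(x):
    # 0.01 * x for x < 0, x for x >= 0; the small negative slope keeps
    # gradients from vanishing entirely for negative inputs
    return np.where(x < 0, 0.01 * x, x)
```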
5. Parametric ReLU: $PR(x) = \begin{cases} ax, & \text{for } x < 0 \\ x, & \text{for } x \ge 0 \end{cases}$
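The same sketch with the slope as a parameter; in practice $a$ is learned together with the network weights (which is what makes it "parametric"), but here it is just an argument:

```python
import numpy as np

def prelu(x, a):
    # a * x for x < 0, x for x >= 0; a is the (learnable) negative slope
    return np.where(x < 0, a * x, x)
```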
6. Exponential linear unit: $\mathrm{ELU}(x) = \begin{cases} a\left(e^x - 1\right), & \text{for } x < 0 \\ x, & \text{for } x \ge 0 \end{cases}$
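A NumPy sketch, assuming $a$ is a fixed hyperparameter with default 1:

```python
import numpy as np

def elu(x, a=1.0):
    # a * (e^x - 1) for x < 0, x for x >= 0
    # np.where evaluates both branches, so clamp x at 0 inside exp
    # to avoid overflow for large positive inputs
    return np.where(x < 0, a * (np.exp(np.minimum(x, 0.0)) - 1.0), x)
```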
7. SoftPlus: $SP(x) = \log(1 + e^x)$
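A NumPy sketch (here, as usual, $\log$ is the natural logarithm):

```python
import numpy as np

def softplus(x):
    # SP(x) = log(1 + e^x), a smooth approximation of ReLU
    return np.log1p(np.exp(x))  # log1p(t) = log(1 + t), accurate for small t
```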
8. Softmax: $S(x_1, x_2, \ldots, x_n) = \left( \frac{e^{x_1}}{\sum_{i=1}^{n} e^{x_i}}, \frac{e^{x_2}}{\sum_{i=1}^{n} e^{x_i}}, \ldots, \frac{e^{x_n}}{\sum_{i=1}^{n} e^{x_i}} \right)$
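A NumPy sketch; subtracting the maximum before exponentiating is a standard stability trick and does not change the result, since the shift cancels in the ratio:

```python
import numpy as np

def softmax(x):
    # S(x)_j = e^{x_j} / sum_i e^{x_i}
    z = np.exp(x - np.max(x))
    return z / np.sum(z)

print(softmax(np.array([1.0, 2.0, 3.0])))  # components are positive and sum to 1
```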
Questions
1. Why do we need activation functions?
2. Should activation functions be defined per layer or per neuron?
Outline
1. Gradient Descent
2. Linear and Logistic Regressions
Gradient Descent
Let $f : \mathbb{R}^k \to \mathbb{R}$ be a convex function whose global minimum we want to find. Gradient descent is based on the fact that the direction in which $f$ decreases fastest is the direction opposite to its gradient, so starting from an arbitrary point $x_0 \in \mathbb{R}^k$ we iterate
$$x_{n+1} = x_n - \alpha \nabla f(x_n),$$
where $\alpha > 0$ is the step size (learning rate).
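A minimal sketch of the iteration, assuming a fixed step size and a caller-supplied gradient `grad_f` (both names are ours):

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, steps=1000):
    # Iterate x_{n+1} = x_n - alpha * grad_f(x_n) from an arbitrary start x0
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - alpha * grad_f(x)
    return x

# Example: f(x) = ||x||^2 has gradient 2x and its global minimum at the origin
print(gradient_descent(lambda x: 2 * x, x0=[3.0, -4.0]))  # ~ [0, 0]
```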
Linear Regression
Let $(x_i, y_i)_{i=1}^{n}$, $x_i \in \mathbb{R}^k$, $y_i \in \mathbb{R}$, be our training data. Consider the function
$$f(x) = f\left(x^1, x^2, \ldots, x^k\right) = w^1 x^1 + w^2 x^2 + \ldots + w^k x^k + b = w^T x + b.$$
Our aim is to find parameters $b, w^1, w^2, \ldots, w^k$ such that $f(x_i) \approx y_i$ for $i = 1, \ldots, n$. As our loss function we choose the mean squared error (the squared $L^2$ distance, averaged over the data):
$$\frac{1}{n} \sum_{l=1}^{n} \left(f(x_l) - y_l\right)^2.$$
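A minimal end-to-end sketch on synthetic data (the true weights and all names here are ours), fitting $w$ and $b$ by gradient descent on this loss:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # n = 100 samples, k = 3 features
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0     # synthetic targets with known w, b

w, b, alpha = np.zeros(3), 0.0, 0.1
for _ in range(500):
    err = X @ w + b - y                      # residuals f(x_l) - y_l
    w -= alpha * 2.0 / len(y) * (X.T @ err)  # gradient of the loss w.r.t. w
    b -= alpha * 2.0 * err.mean()            # gradient of the loss w.r.t. b

print(w, b)  # approaches (2, -1, 0.5) and 3
```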
Questions
1. Should we minimize the loss function using gradient descent?
2. Can you represent this model as a neural network?