
CS-671: Deep Learning and its Applications

Lecture: 06
Computational Graphs

Aditya Nigam, Assistant Professor


School of Computing and Electrical Engineering (SCEE)
Indian Institute of Technology, Mandi
http://faculty.iitmandi.ac.in/~aditya/ [email protected]

Presentation for CS-671@IIT Mandi (6 March, 2019)


(Slides credit: Calculus on Computational Graphs: Backpropagation, colah's blog)
http://colah.github.io/posts/2015-08-Backprop/

February - May, 2019


Calculus on Computational Graphs: Backpropagation

Introduction
Backpropagation is the key algorithm that makes training deep
models computationally tractable.
Beyond its use in deep learning, backpropagation is a powerful computational tool in many other areas, ranging from weather forecasting to analyzing numerical stability; it just goes by different names.
The general, application-independent name is reverse-mode differentiation.
Fundamentally, it's a technique for calculating derivatives quickly.

Computational Graphs
Computational graphs are a nice way to think about mathematical
expressions.
For example, consider the expression e=(a+b)(b+1). There are three
operations: two additions and one multiplication.
To create a computational graph, we make each of these operations,
along with the input variables, into nodes.
When one node's value is the input to another node, an arrow goes from one node to the other.
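
As an illustration (not part of the original slides), here is a minimal Python sketch of this construction for e = (a + b)(b + 1); Node, add and mul are hypothetical helper names chosen for the example, and the input values a = 2, b = 1 are the ones the later derivative example assumes:

# Sketch: build the computational graph for e = (a + b) * (b + 1) explicitly.
class Node:
    def __init__(self, name, value=None, parents=()):
        self.name = name        # label of the node in the graph
        self.value = value      # value computed in the forward pass
        self.parents = parents  # nodes whose outputs feed into this node

def add(x, y, name):
    return Node(name, x.value + y.value, parents=(x, y))

def mul(x, y, name):
    return Node(name, x.value * y.value, parents=(x, y))

a = Node("a", 2.0)      # input a = 2
b = Node("b", 1.0)      # input b = 1
one = Node("1", 1.0)    # the constant 1

c = add(a, b, "c")      # c = a + b
d = add(b, one, "d")    # d = b + 1
e = mul(c, d, "e")      # e = c * d

print(e.value)          # 6.0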

Derivatives on Computational Graphs
The key is to understand derivatives on the edges.
If a directly affects c, then we want to know how it affects c.
If a changes a little bit, how does c change? We call this the partial
derivative of c with respect to a.
To evaluate the partial derivatives in the graph, we need the sum rule and the product rule:
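
For reference (these are the standard calculus rules the slide names, written here in the same notation):

∂/∂a (a + b) = ∂a/∂a + ∂b/∂a = 1        ∂/∂u (u v) = u ∂v/∂u + v ∂u/∂u = v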



What if we want to understand how nodes that aren't directly connected affect each other?
Let's consider how e is affected by a. If we change a at a speed of 1, c also changes at a speed of 1.
In turn, c changing at a speed of 1 causes e to change at a speed of
2. So e changes at a rate of 1 × 2 with respect to a.
The general rule is to sum over all possible paths from one node to the
other, multiplying the derivatives on each edge of the path together.
For example, to get the derivative of e with respect to b we get:

∂e/∂b = 1 × 2 + 1 × 3    (1)
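
As a quick sanity check (a sketch, assuming the example values a = 2 and b = 1), the path sum above agrees with a numerical estimate of ∂e/∂b:

# Sketch: compare the path-sum derivative with a finite-difference estimate.
def e(a, b):
    return (a + b) * (b + 1)

a, b, h = 2.0, 1.0, 1e-6
path_sum = 1 * (b + 1) + 1 * (a + b)        # path b->c->e plus path b->d->e
finite_diff = (e(a, b + h) - e(a, b)) / h   # numerical derivative wrt b

print(path_sum)               # 5.0
print(round(finite_diff, 4))  # ~5.0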

Factoring Paths

The problem with just summing over the paths is that it's very easy to get a combinatorial explosion in the number of possible paths.
Suppose, for instance, that X connects to Y by three edges whose derivatives are α, β, γ, and Y connects to Z by three edges whose derivatives are δ, ε, ζ. If we want to get the derivative of Z with respect to X by summing over all paths, we need to sum over 3 × 3 = 9 paths.
Instead of just naively summing over the paths, it would be much
better to factor them:
∂Z/∂X = (α + β + γ)(δ + ε + ζ)    (2)
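
A small sketch (with made-up values for the edge derivatives α, β, γ and δ, ε, ζ) shows that the naive nine-path sum and the factored form agree, while the factored form multiplies only once:

# Sketch: naive path sum vs. factored form for dZ/dX (illustrative numbers).
import itertools

alpha, beta, gamma = 0.5, 1.0, 2.0     # derivatives on the three X -> Y edges
delta, epsilon, zeta = 3.0, 0.25, 1.5  # derivatives on the three Y -> Z edges

# Naive: multiply along each of the 3 * 3 = 9 paths and add them up.
naive = sum(p * q for p, q in itertools.product((alpha, beta, gamma),
                                                (delta, epsilon, zeta)))

# Factored: sum the edges into each node first, then multiply once.
factored = (alpha + beta + gamma) * (delta + epsilon + zeta)

print(naive, factored)  # 16.625 16.625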
This is where forward-mode differentiation and reverse-mode
differentiation come in.

Forward-mode differentiation
Instead of summing over all of the paths explicitly, forward-mode and reverse-mode differentiation compute the same sum more efficiently by merging paths back together at every node.
In fact, both algorithms touch each edge exactly once!
It starts at an input to the graph and moves towards the end. At
every node, it sums all the paths feeding in.
Each of those paths represents one way in which the input affects
that node.
By adding them up, we get the total way in which the node is
affected by the input, its derivative.
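
A minimal sketch of one forward-mode pass over the example graph e = (a + b)(b + 1), seeding the input b with derivative 1 (values a = 2, b = 1 as before):

# Sketch: forward-mode differentiation, carrying d(node)/db alongside each value.
a, b = 2.0, 1.0
da, db = 0.0, 1.0        # derivatives of the inputs with respect to b

c = a + b                # c = 3
dc = da + db             # sum rule: dc/db = 1

d = b + 1.0              # d = 2
dd = db                  # dd/db = 1

e = c * d                # e = 6
de = dc * d + c * dd     # product rule: de/db = 1*2 + 3*1 = 5

print(dc, dd, de)        # 1.0 1.0 5.0 -- the derivative of every node wrt b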


Reverse-mode differentiation, in contrast, starts at an output of the graph and moves towards the beginning; at each node, it merges all paths which originated at that node.


Forward-mode differentiation tracks how one input affects every node.
Reverse-mode differentiation tracks how every node affects one
output.

Forward-mode differentiation gives us the derivative of every node with respect to b; in particular ∂e/∂b, the derivative of our output with respect to one of our inputs.
Reverse-mode differentiation gives us the derivative of e with respect to every node.
Forward-mode differentiation gave us the derivative of our output
with respect to a single input, but reverse-mode differentiation gives
us all of them.
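
A minimal sketch of one reverse-mode (backpropagation) pass over the same graph, recovering the derivative of e with respect to both inputs at once (again assuming a = 2, b = 1):

# Sketch: reverse-mode differentiation, propagating de/d(node) backwards.
a, b = 2.0, 1.0
c = a + b                # forward pass: c = 3
d = b + 1.0              #               d = 2
e = c * d                #               e = 6

grad_e = 1.0             # de/de = 1
grad_c = grad_e * d      # de/dc = d = 2
grad_d = grad_e * c      # de/dd = c = 3
grad_a = grad_c * 1.0    # de/da, via the single path a -> c -> e
grad_b = grad_c * 1.0 + grad_d * 1.0  # de/db, summing the paths through c and d

print(grad_a, grad_b)    # 2.0 5.0 -- all input derivatives from one backward pass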

Computational Victories
Reverse-mode differentiation looks like a strange way of doing the same thing as forward-mode differentiation.

Is there some advantage?

Forward-mode differentiation tracks how one input affects every node.
Reverse-mode differentiation tracks how every node affects one output.
For a model with many inputs (its parameters) and a single output (the loss), reverse-mode differentiation therefore yields the derivative with respect to every input in one backward pass, whereas forward-mode differentiation would need one pass per input.

