Deep Recurrent Networks
Sargur Srihari
[email protected]
Topics
• Recurrent Neural Networks
1. Unfolding Computational Graphs
2. Recurrent Neural Networks
3. Bidirectional RNNs
4. Encoder-Decoder Sequence-to-Sequence Architectures
5. Deep Recurrent Networks
6. Recursive Neural Networks
7. The Challenge of Long-Term Dependencies
8. Echo-State Networks
9. Leaky Units and Other Strategies for Multiple Time Scales
10. LSTM and Other Gated RNNs
11. Optimization for Long-Term Dependencies
12. Explicit Memory
Computation in RNNs: parameter blocks
• The computation in most recurrent neural networks can be decomposed into three blocks of parameters and associated transformations (sketched in code below):
1. From the input to the hidden state
2. From the previous hidden state to the next hidden state
3. From the hidden state to the output
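To make the three blocks concrete, here is a minimal PyTorch sketch of a vanilla RNN cell with one module per block (illustrative only; the class name, layer names, and sizes are assumptions, not from the lecture):

```python
import torch
import torch.nn as nn

class VanillaRNNCell(nn.Module):
    """One module per parameter block of a simple RNN."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.input_to_hidden = nn.Linear(input_size, hidden_size)    # 1. input -> hidden state
        self.hidden_to_hidden = nn.Linear(hidden_size, hidden_size)  # 2. previous hidden -> next hidden
        self.hidden_to_output = nn.Linear(hidden_size, output_size)  # 3. hidden state -> output

    def forward(self, x_t, h_prev):
        # Each block is a single affine map; tanh is the fixed nonlinearity
        h_t = torch.tanh(self.input_to_hidden(x_t) + self.hidden_to_hidden(h_prev))
        o_t = self.hidden_to_output(h_t)
        return o_t, h_t

# Usage on dummy data: 20 time steps, batch of 8, 16 input features
cell = VanillaRNNCell(input_size=16, hidden_size=32, output_size=10)
h = torch.zeros(8, 32)
for x_t in torch.randn(20, 8, 16):
    o_t, h = cell(x_t, h)
```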
Blocks of parameters as a shallow transformation
• With the RNN architecture shown, each of these three blocks is associated with a single weight matrix
• When the network is unfolded, each of these corresponds to a shallow transformation
• By a shallow transformation we mean a transformation that would be represented by a single layer within a deep MLP
• Typically this is a learned affine transformation followed by a fixed nonlinearity
• Would it be advantageous to introduce depth into each of these operations?
• Experimental evidence strongly suggests so
• The evidence is in agreement with the idea that we need enough depth to perform the required transformations
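Concretely, each shallow block in a standard RNN is one learned affine map followed by a fixed nonlinearity; in the commonly used notation (U: input-to-hidden, W: hidden-to-hidden, V: hidden-to-output weight matrices; b, c: bias vectors):

$$
h^{(t)} = \tanh\!\left(b + W h^{(t-1)} + U x^{(t)}\right), \qquad o^{(t)} = c + V h^{(t)}
$$

Introducing depth means replacing one or more of these single affine-plus-nonlinearity steps with a multi-layer network.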
Ways of making an RNN deep
1. The hidden recurrent state can be broken down into groups organized hierarchically.
2. Deeper computation can be introduced in the input-to-hidden, hidden-to-hidden and hidden-to-output parts. This may lengthen the shortest path linking different time steps.
3. The path-lengthening effect can be mitigated by introducing skip connections.
1. Recurrent states broken down into groups
We can think of the lower levels of the hierarchy as playing the role of transforming the raw input into a representation that is more appropriate at the higher levels of the hidden state.
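A minimal PyTorch sketch of such a hierarchy, assuming two stacked recurrent layers (layer names and sizes are illustrative, not from the lecture): the lower layer transforms the raw input sequence, and the upper layer's hidden state operates on that intermediate representation.

```python
import torch
import torch.nn as nn

lower_rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)  # lower group of hidden units
upper_rnn = nn.RNN(input_size=32, hidden_size=64, batch_first=True)  # higher group of hidden units

x = torch.randn(8, 20, 16)                  # dummy batch: (batch, time, features)
lower_states, _ = lower_rnn(x)              # representation of the raw input
upper_states, _ = upper_rnn(lower_states)   # higher-level hidden states
```

PyTorch's nn.RNN also accepts num_layers=2 to build such a stack (with a shared hidden size) inside a single module.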
2. Deeper computation in hidden-to-hidden
• We can go a step further and use a separate MLP (possibly deep) for each of the three blocks (see the code sketch below):
1. From the input to the hidden state
2. From the previous hidden state to the next hidden state
3. From the hidden state to the output
• Considerations of representational capacity suggest that we should allocate enough capacity in each of these three steps
• But doing so by adding depth may hurt learning by
making optimization difficult
• In general it is easier to optimize shallower architectures
• Adding the extra depth makes the shortest path from a variable in time step t to a variable in time step t+1 become longer
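A minimal sketch of this idea, assuming a single extra hidden layer in the state-to-state transition (class and layer names are illustrative assumptions):

```python
import torch
import torch.nn as nn

class DeepTransitionRNNCell(nn.Module):
    """RNN cell whose hidden-to-hidden block is a small MLP instead of one affine map."""
    def __init__(self, input_size, hidden_size, output_size, mlp_size=64):
        super().__init__()
        self.input_to_hidden = nn.Linear(input_size, hidden_size)
        # Deeper hidden-to-hidden block: the extra layer lengthens the h(t-1) -> h(t) path
        self.hidden_to_hidden = nn.Sequential(
            nn.Linear(hidden_size, mlp_size),
            nn.Tanh(),
            nn.Linear(mlp_size, hidden_size),
        )
        self.hidden_to_output = nn.Linear(hidden_size, output_size)

    def forward(self, x_t, h_prev):
        h_t = torch.tanh(self.input_to_hidden(x_t) + self.hidden_to_hidden(h_prev))
        return self.hidden_to_output(h_t), h_t
```

The extra layer inside hidden_to_hidden is exactly what doubles the shortest path between hidden states at adjacent time steps.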
3. Introducing skip connections
• For example, if an MLP with a single hidden layer is used for the state-to-state transition, we have doubled the length of the shortest path between variables in any two different time steps, compared with the ordinary RNN
• This can be mitigated by introducing skip connections in the hidden-to-hidden path, as illustrated here
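A minimal sketch of the mitigation, assuming the same deep state-to-state MLP plus a direct (skip) connection from h(t-1) to h(t) that restores a length-one path between adjacent time steps (names and sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SkipConnectionRNNCell(nn.Module):
    """Deep state-to-state transition with a skip connection in the hidden-to-hidden path."""
    def __init__(self, input_size, hidden_size, mlp_size=64):
        super().__init__()
        self.input_to_hidden = nn.Linear(input_size, hidden_size)
        self.deep_transition = nn.Sequential(            # deep hidden-to-hidden block
            nn.Linear(hidden_size, mlp_size),
            nn.Tanh(),
            nn.Linear(mlp_size, hidden_size),
        )
        self.skip = nn.Linear(hidden_size, hidden_size)  # single-step shortcut h(t-1) -> h(t)

    def forward(self, x_t, h_prev):
        # The skip term keeps a length-one path between adjacent hidden states
        h_t = torch.tanh(self.input_to_hidden(x_t)
                         + self.deep_transition(h_prev)
                         + self.skip(h_prev))
        return h_t
```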