The LSTM Reference Card
[Diagram: the LSTM cell. The Previous Cell State (LTM) is multiplied (×) by the Forget Gate's sigmoid (σ) output, then summed (+) with the Input Gate's σ × tanh product to form the New Cell State (LTM). The Output Gate multiplies a σ output by the tanh of the New Cell State to produce the new Hidden State h (STM). The Previous Output, aka Hidden State h (STM), and the New Event x feed every gate: each gate g in {f, i, l, o} computes W_hg ∙ h + B_hg + W_xg ∙ x + B_xg.]

× : element-wise multiplication
x : New Event, size (x,1)        W_x{f,i,l,o} : size (h,x)
h : Hidden State, size (h,1)     W_h{f,i,l,o} : size (h,h)

It is unlikely that the ideal size of the hidden state is also the desired output size of the model. In most cases, an LSTM layer passes its output to a final, fully connected layer, which returns the desired output size.
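A minimal sketch of that final projection (the names fully_connected, W_out, and B_out are illustrative assumptions, not part of the card):

import numpy as np

# Hypothetical final layer: projects the (h,1) hidden state to the
# (output_size,1) vector the task actually requires.
def fully_connected(h, W_out, B_out):  # W_out: (output_size, h), B_out: (output_size, 1)
    return np.dot(W_out, h) + B_out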
Forget Gate
The Forget Gate selectively forgets some of what the Cell State (LTM) holds in memory. The New Event and the previous period's Hidden State are each transformed by their weights and biases, summed (element-wise), and then passed through a sigmoid function. The output is therefore a vector with entries between 0 and 1. When the Previous Cell State is multiplied (element-wise) by this vector, some proportion (between 0 and 1) of each value in the Previous Cell State makes it "through the gate" and is retained; the rest is forgotten.
import numpy as np
from scipy.special import expit as sigmoid
def forget_gate(x, h, Weights_hf, Bias_hf, Weights_xf, Bias_xf, prev_cell_state):
    # Transform the Hidden State and the New Event, gate with a sigmoid,
    # then scale the Previous Cell State element-wise.
    forget_hidden = np.dot(Weights_hf, h) + Bias_hf
    forget_eventx = np.dot(Weights_xf, x) + Bias_xf
    return np.multiply(sigmoid(forget_hidden + forget_eventx), prev_cell_state)
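A quick usage sketch (the toy sizes h=3, x=2 and the random parameters are illustrative assumptions):

rng = np.random.default_rng(0)
h = rng.standard_normal((3, 1))       # previous Hidden State, size (h,1)
x = rng.standard_normal((2, 1))       # New Event, size (x,1)
W_hf, B_hf = rng.standard_normal((3, 3)), rng.standard_normal((3, 1))
W_xf, B_xf = rng.standard_normal((3, 2)), rng.standard_normal((3, 1))
c_prev = rng.standard_normal((3, 1))  # Previous Cell State
retained = forget_gate(x, h, W_hf, B_hf, W_xf, B_xf, c_prev)
# Every entry of c_prev is scaled by a sigmoid factor in (0, 1), so memory can only shrink here:
assert np.all(np.abs(retained) <= np.abs(c_prev))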
Input Gate (aka Learn Gate)
The Input Gate has two components: a way to "Ignore" new information and a way to "Learn" new information. In each case, the New Event and the previous period's Hidden State are transformed by their weights and biases and summed. The Ignore component is then transformed with the same logic as the Forget Gate: a sigmoid function creates a vector of proportions (values between 0 and 1). The Learn component uses a hyperbolic tangent function, which returns a vector with values between -1 and 1; this lets the model capture both positive and negative relationships in the data. When the Learn component is multiplied (element-wise) by the Ignore component, some proportion of each value from the Learn component makes it "through the gate" and is retained; the rest is ignored.
def input_gate(x, h, Weights_hi, Bias_hi, Weights_xi, Bias_xi,
               Weights_hl, Bias_hl, Weights_xl, Bias_xl):
    # The "Ignore" path (sigmoid) scales the "Learn" path (tanh) element-wise.
    ignore_hidden = np.dot(Weights_hi, h) + Bias_hi
    ignore_eventx = np.dot(Weights_xi, x) + Bias_xi
    learn_hidden = np.dot(Weights_hl, h) + Bias_hl
    learn_eventx = np.dot(Weights_xl, x) + Bias_xl
    return np.multiply(sigmoid(ignore_eventx + ignore_hidden), np.tanh(learn_eventx + learn_hidden))
Cell State (aka Long Term Memory)
The Cell State at each time step is calculated by adding two vectors together: the output of the Forget Gate and the output of the Input Gate. The Cell State is used in the Output Gate (below) to determine the model's current output; it is also carried forward to be used for the next Event's forward pass.
def cell_state(forget_gate_output, input_gate_output):
    return forget_gate_output + input_gate_output
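Written out in the card's notation, the update computed by the three functions above (a restatement, nothing new) is:

new_cell_state = sigmoid(W_hf ∙ h + B_hf + W_xf ∙ x + B_xf) × prev_cell_state
               + sigmoid(W_hi ∙ h + B_hi + W_xi ∙ x + B_xi) × tanh(W_hl ∙ h + B_hl + W_xl ∙ x + B_xl)

where × is element-wise multiplication.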
Output Gate (aka Use Gate)
The Output Gate returns a vector that is both the model's output for that Event and the new Hidden State h (STM), which is carried forward to the next Event's forward pass. The Cell State, previous Hidden State, and New Event all contribute to this vector: the New Event and previous Hidden State are transformed by their weights and biases, summed, passed through a sigmoid, and then multiplied (element-wise) by the tanh-transformed Cell State.
def output_gate(x, h, Weights_ho, Bias_ho, Weights_xo, Bias_xo, cell_state):
    # Sigmoid-gate the transformed inputs, then scale the tanh of the Cell State.
    out_hidden = np.dot(Weights_ho, h) + Bias_ho
    out_eventx = np.dot(Weights_xo, x) + Bias_xo
    return np.multiply(sigmoid(out_eventx + out_hidden), np.tanh(cell_state))
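Putting the four functions together, here is a single-step forward pass. The toy sizes, the rand_params helper, and the zero initial states are illustrative assumptions; the card's full worked example is at the link below.

rng = np.random.default_rng(42)
h_size, x_size = 4, 3

def rand_params():  # hypothetical helper: one gate's (W_h, B_h, W_x, B_x)
    return (rng.standard_normal((h_size, h_size)), rng.standard_normal((h_size, 1)),
            rng.standard_normal((h_size, x_size)), rng.standard_normal((h_size, 1)))

W_hf, B_hf, W_xf, B_xf = rand_params()  # Forget Gate
W_hi, B_hi, W_xi, B_xi = rand_params()  # Input Gate, "Ignore" path
W_hl, B_hl, W_xl, B_xl = rand_params()  # Input Gate, "Learn" path
W_ho, B_ho, W_xo, B_xo = rand_params()  # Output Gate

x = rng.standard_normal((x_size, 1))  # New Event
h = np.zeros((h_size, 1))             # previous Hidden State (STM)
c = np.zeros((h_size, 1))             # Previous Cell State (LTM)

retained = forget_gate(x, h, W_hf, B_hf, W_xf, B_xf, c)
learned = input_gate(x, h, W_hi, B_hi, W_xi, B_xi, W_hl, B_hl, W_xl, B_xl)
c = cell_state(retained, learned)                 # New Cell State, carried forward
h = output_gate(x, h, W_ho, B_ho, W_xo, B_xo, c)  # new Hidden State / model output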
Full code with example at [Link]/articles/lstm-ref-card (V. 001)
Reference: [Link]and-gru-s-a-step-by-step-explanation-44e9eb85bf21
conditg & mricos