Unit 4
Deep Learning for Text and Sequences
Presented by Syed Ateeq
As per Dr. Babasaheb Ambedkar Technological University
Introduction To Sequential/Temporal Data
Sequential/Temporal data in deep learning
refers to data where the order of elements
matters.
Examples include time series data (like
stock prices or sensor readings) and
natural language (text).
Sequential Data: Data where the order of
elements is crucial. This means the
position/sequence of elements within data
influences its meaning or interpretation.
Temporal Data:
A specific type of sequential data where
the order is based on time.
Examples include time series data, where
data points are indexed by time.
Sequential data like time series and
natural language require models that can
capture ordering and context.
While time series analysis focuses on
forecasting based on temporal patterns,
natural language processing aims to
extract semantic meaning from word
sequences.
A time series is a set of observations over
time that are ordered chronologically and
sampled at fixed time intervals. Some
examples include:
Stock prices every day
Server metrics every hour
Temperature readings every second
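For instance, a time series can be stored as an ordered array of values together with a fixed sampling interval; the short sketch below (variable names are illustrative, not from the slides) shows hourly server readings whose timestamps follow from their positions.

    from datetime import datetime, timedelta

    # Hourly server metric readings: the position of each value encodes its time.
    start = datetime(2024, 1, 1, 0, 0)
    interval = timedelta(hours=1)          # fixed sampling interval
    cpu_load = [0.42, 0.55, 0.61, 0.58, 0.73]

    # Reconstruct the timestamp of each observation from its index.
    for i, value in enumerate(cpu_load):
        print(start + i * interval, value)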
Sequential Models
A sequential model is a type of model
where data is processed in a specific
order, and the context of previous data
points is crucial for prediction.
These models are designed to handle
sequential data, such as text, audio, or
time-series data, where the order of
elements is significant.
Unlike traditional CNNs that process
spatial data like images, sequential
models are well-suited for tasks involving
sequential dependencies.
Key Characteristics:
Sequential Data:
These models deal with data where the
order of elements matters, like words in a
sentence, frames in a video, or data points
in a time series.
Contextual Information:
The model learns from previous data
points to make predictions, understanding
the relationships between them.
Common Architectures:
Recurrent Neural Networks (RNNs) and
Transformers are prominent examples of
sequential model architectures.
Applications:
These models are widely used in tasks
such as:
Natural Language Processing (NLP):
Machine translation, text generation,
sentiment analysis.
Speech Recognition:
Converting spoken language to text.
Time Series Forecasting:
Predicting future values based on past
data.
Music Generation: Creating new music
pieces.
Introduction to Recurrent Neural Networks
Recurrent Neural Networks (RNNs) differ
from regular neural networks in how they
process information.
While standard neural networks pass
information in one direction, i.e. from
input to output, RNNs feed information
back into the network at each step.
Imagine reading a sentence and trying to
predict the next word: you do not rely only
on the current word, but also remember
the words that came before.
RNNs work similarly by remembering
past information and passing the output
from one step as input to the next, i.e. they
consider all the earlier words to choose
the most likely next word.
This memory of previous steps helps the
network understand context and make
better predictions.
Key Components of RNNs:
There are mainly two components of
RNNs:
1. Recurrent Neurons:
The fundamental processing unit in RNN
is a Recurrent Unit.
They hold a hidden state that maintains
information about previous inputs in a
sequence.
Recurrent units can remember
information from prior steps by feeding
back their hidden state, allowing them to
capture dependencies across time.
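Written out, the recurrent unit's behaviour is usually summarised by the update rule below (the symbols and the tanh activation follow common textbook convention rather than these slides):

    h_t = tanh(W_x * x_t + W_h * h_(t-1) + b)

Here x_t is the input at time step t, h_(t-1) is the previous hidden state, and W_x, W_h, b are learned parameters shared across all time steps.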
(Figure: Recurrent neuron)
2. RNN Unfolding:
RNN unfolding or unrolling is the process
of expanding the recurrent structure over
time steps.
During unfolding, each step of the
sequence is represented as a separate layer
in a series, illustrating how information
flows across each time step.
This unrolling enables back-propagation
through time (BPTT), a learning process
where errors are propagated across time
steps to adjust the network's weights,
enhancing the RNN's ability to learn
dependencies within sequential data.
(Figure: RNN unfolding)
Working Of RNN
1. Sequence Input:
RNNs take a sequence of inputs as their
primary data format.
Each element in the sequence corresponds
to a time step, and the goal is to process
the sequence step by step to learn patterns
and relationships.
2. Hidden State:
At each time step, an RNN maintains a
hidden state vector, which captures
information about the current input and
the previous hidden state.
This hidden state serves as the memory of
the network and allows it to maintain
context from earlier time steps.
3. Input and Hidden State Interaction:
The input at each time step is combined
with the previous hidden state to produce
a new hidden state.
This combination involves matrix
multiplication and activation functions,
often a hyperbolic tangent (tanh) or
rectified linear unit (ReLU).
4. Updating Hidden State:
The newly computed hidden state replaces the previous one and is carried forward to the next time step (see the sketch after this list).
5. Sequence Processing:
The RNN processes the entire sequence
by iteratively updating the hidden state for
each time step.
As it progresses through the sequence, the
hidden state accumulates information
about the previous inputs and hidden
states.
6. Output Generation:
Depending on the task, an RNN can
produce an output at each time step or
only at the final time step.
For example, in language modeling, an
output might be generated at each step to
predict the next word, whereas in
sequence classification, the final hidden
state might be used to produce the final
prediction.
7. Back-propagation Through Time
(BPTT):
To train an RNN, the process of back-
propagation is extended through time.
Errors are propagated backward from the
output predictions to the initial time step,
updating weights to minimize the difference
between predicted and actual values.
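As a concrete illustration of steps 3 to 7, the sketch below runs a tiny RNN forward over a sequence with NumPy. The weight names (W_xh, W_hh, W_hy) and all sizes are illustrative assumptions; in practice the weights are learned, and BPTT would be handled by an autodiff framework rather than written by hand.

    import numpy as np

    np.random.seed(0)
    input_size, hidden_size, output_size = 4, 8, 3

    # Illustrative, untrained weights.
    W_xh = np.random.randn(hidden_size, input_size) * 0.1
    W_hh = np.random.randn(hidden_size, hidden_size) * 0.1
    W_hy = np.random.randn(output_size, hidden_size) * 0.1
    b_h = np.zeros(hidden_size)
    b_y = np.zeros(output_size)

    def rnn_forward(xs):
        """Process a sequence step by step, carrying the hidden state forward."""
        h = np.zeros(hidden_size)                     # initial hidden state
        outputs = []
        for x in xs:                                  # one element per time step
            h = np.tanh(W_xh @ x + W_hh @ h + b_h)    # steps 3-4: update the hidden state
            outputs.append(W_hy @ h + b_y)            # step 6: output at this step
        return outputs, h                             # per-step outputs and final state

    sequence = [np.random.randn(input_size) for _ in range(5)]
    outputs, final_h = rnn_forward(sequence)
    print(len(outputs), final_h.shape)                # 5 outputs, final state of size 8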
Representing Sequential Data using RNN
1. One-to-One RNN:
This is the simplest type of
neural network architecture
where there is a single input
and a single output.
It is used for straightforward
classification tasks such as
binary classification where
no sequential data is
involved.
2. One-to-Many RNN:
In a One-to-Many RNN the network
processes a single input to produce
multiple outputs over time.
This is useful in tasks where one input
triggers a sequence of predictions
(outputs).
For example, in image captioning, a single
image can be used as input to generate a
sequence of words as a caption.
(Figure: One-to-Many RNN)
3. Many-to-One RNN:
The Many-to-One RNN receives a
sequence of inputs and generates a single
output.
This type is useful when the overall
context of the input sequence is needed to
make one prediction.
In sentiment analysis the model receives a
sequence of words and produces a single
output like positive, negative or neutral.
(Figure: Many-to-One RNN)
4. Many-to-Many RNN:
The Many-to-Many RNN type processes
a sequence of inputs and generates a
sequence of outputs.
In a language translation task, a sequence of
words in one language is given as input
and a corresponding sequence in another
language is generated as output.
(Figure: Many-to-Many RNN)
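In code, the difference between the many-to-one and many-to-many patterns described above comes down to which hidden states or outputs are kept. The sketch below uses illustrative, untrained NumPy weights to make that concrete.

    import numpy as np

    hidden_size, input_size = 8, 4
    W_xh = np.random.randn(hidden_size, input_size) * 0.1
    W_hh = np.random.randn(hidden_size, hidden_size) * 0.1

    def run_rnn(xs):
        """Return the hidden state after every time step."""
        h = np.zeros(hidden_size)
        states = []
        for x in xs:
            h = np.tanh(W_xh @ x + W_hh @ h)
            states.append(h)
        return states

    sequence = [np.random.randn(input_size) for _ in range(6)]
    states = run_rnn(sequence)

    many_to_one = states[-1]   # e.g. sentiment analysis: one prediction from the final state
    many_to_many = states      # e.g. translation or tagging: one output per time step
    print(many_to_one.shape, len(many_to_many))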
Working With Text Data
Working with text data in deep learning,
often referred to as Natural Language
Processing (NLP), involves
preprocessing, transforming text into
numerical representations, and building
models to analyze and generate human
language.
Key steps include cleaning and
normalizing text, tokenization, creating
word embeddings, and using various deep
learning architectures like LSTMs and
CNNs for tasks like classification,
translation, and text generation.
1. Preprocessing and Cleaning:
Data Collection and Loading: Gather text
data from various sources (e.g., websites,
files).
Cleaning: Remove irrelevant information
like HTML tags, punctuation, and stop
words (common words like "the," "a").
Normalization: Convert text to
lowercase, handle contractions, and
standardize formats.
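A minimal cleaning and normalization pass might look like the sketch below; the regular expressions and the tiny stop-word list are illustrative assumptions rather than a prescribed pipeline.

    import re

    STOP_WORDS = {"the", "a", "an", "is", "and"}       # tiny illustrative stop-word list

    def clean_text(raw):
        text = re.sub(r"<[^>]+>", " ", raw)            # strip HTML tags
        text = text.lower()                            # normalize case
        text = re.sub(r"[^a-z\s]", " ", text)          # drop punctuation and digits
        words = [w for w in text.split() if w not in STOP_WORDS]
        return " ".join(words)

    print(clean_text("<p>The movie IS surprisingly good!</p>"))
    # -> "movie surprisingly good"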
2. Text Representation and
Vectorization:
Tokenization:
Break down text into smaller units
(tokens) like words or sentences.
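For example, word-level tokenization can be as simple as splitting on non-letter characters; this is a deliberately simplified sketch, whereas real tokenizers also handle punctuation, subwords, and so on.

    import re

    def word_tokenize(text):
        # Split into lowercase word tokens; everything else acts as a separator.
        return re.findall(r"[a-z']+", text.lower())

    print(word_tokenize("Don't split me badly!"))
    # -> ["don't", 'split', 'me', 'badly']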
Word Embeddings:
Convert words into numerical vectors,
capturing semantic relationships between
words.
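Conceptually, an embedding layer is a lookup table that maps each token ID to a dense vector. The sketch below uses a random matrix as a stand-in for learned embeddings; the vocabulary and sizes are illustrative.

    import numpy as np

    vocab = {"movie": 0, "good": 1, "bad": 2}
    embedding_dim = 5

    # In a trained model these vectors are learned; here they are random placeholders.
    embedding_matrix = np.random.randn(len(vocab), embedding_dim)

    def embed(tokens):
        return np.stack([embedding_matrix[vocab[t]] for t in tokens])

    print(embed(["movie", "good"]).shape)   # (2, 5): one 5-dimensional vector per token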
Bag-of-Words (BoW):
Create a matrix where each row
represents a document, and each column
represents a unique word, counting word
occurrences.
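A bag-of-words matrix can be built directly from a vocabulary and word counts, as in this small sketch (the documents and vocabulary are illustrative).

    from collections import Counter

    docs = ["the movie was good", "the movie was bad bad"]

    # Vocabulary: one column per unique word across all documents.
    vocab = sorted({w for d in docs for w in d.split()})

    # One row per document, counting occurrences of each vocabulary word.
    bow = [[Counter(d.split())[w] for w in vocab] for d in docs]

    print(vocab)        # ['bad', 'good', 'movie', 'the', 'was']
    for row in bow:
        print(row)      # [0, 1, 1, 1, 1] and [2, 0, 1, 1, 1]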
TF-IDF (Term Frequency-Inverse
Document Frequency):
Weight words based on their frequency in
a document and importance across the
entire corpus.
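The weighting itself is simple arithmetic, commonly tf-idf(w, d) = tf(w, d) * log(N / df(w)); the sketch below uses this plain variant, so exact numbers will differ from libraries that apply smoothing.

    import math
    from collections import Counter

    docs = [d.split() for d in ["the movie was good", "the movie was bad"]]
    N = len(docs)

    def tf_idf(word, doc):
        tf = Counter(doc)[word] / len(doc)        # term frequency within this document
        df = sum(1 for d in docs if word in d)    # number of documents containing the word
        idf = math.log(N / df)                    # rarer words receive a higher weight
        return tf * idf

    print(tf_idf("good", docs[0]))   # > 0: "good" occurs in only one document
    print(tf_idf("the", docs[0]))    # = 0: "the" occurs in every document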
3. Deep Learning Models for Text Data:
LSTMs (Long Short-Term Memory):
Recurrent neural networks well-suited for
sequential data like text, capturing long-
range dependencies between words.
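As a rough illustration (assuming PyTorch is available; the layer sizes and class name are illustrative, not taken from the slides), a many-to-one LSTM text classifier can be wired up like this:

    import torch
    import torch.nn as nn

    class LSTMClassifier(nn.Module):
        def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_classes=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.fc = nn.Linear(hidden_dim, num_classes)

        def forward(self, token_ids):              # token_ids: (batch, seq_len)
            x = self.embed(token_ids)               # (batch, seq_len, embed_dim)
            _, (h_n, _) = self.lstm(x)              # h_n holds the final hidden state
            return self.fc(h_n[-1])                 # one prediction per sequence

    model = LSTMClassifier()
    dummy = torch.randint(0, 1000, (8, 20))         # batch of 8 sequences of length 20
    print(model(dummy).shape)                        # torch.Size([8, 2])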
CNNs (Convolutional Neural
Networks):
Excellent for extracting local features
from text, often used in conjunction with
LSTMs or other models.
Transformers:
Powerful models, like BERT, which have
revolutionized NLP with their ability to
process entire sentences and capture
complex relationships.
4. Common NLP Tasks:
Text Classification: Assigning categories
to text, like sentiment analysis or spam
detection.
Text Generation: Creating new text, such
as writing summaries, essays, or
translating languages.
Machine Translation: Converting text
from one language to another.