BIT4133 NLP
All notes in a single document
Weeks 1-4: Summarized for revision
Weeks 6-14: Detailed for practice
Week 1: Introduction to NLP and Applications
Objective:
Understand the fundamentals of NLP and its real-world applications.
Explore core NLP tasks like speech recognition, machine translation, sentiment
analysis, and chatbots.
Conduct research on real-world use cases of NLP.
Key Terms & Definitions
1. Natural Language Processing (NLP) – The field of AI that enables computers to
understand, interpret, and generate human language.
2. Tokenization – Splitting text into words (word tokenization) or sentences (sentence
tokenization).
3. Lemmatization – Converting words to their base or dictionary form (e.g., "running" →
"run").
4. Named Entity Recognition (NER) – Identifying names of people, locations, dates, etc.
in text.
5. Sentiment Analysis – Determining whether text expresses positive, negative, or
neutral sentiment.
6. Machine Translation (MT) – Automatically translating text from one language to
another.
7. Chatbots – AI-driven conversational agents that interact via text or voice.
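To make the tokenization definition above concrete, here is a minimal sketch that uses only Python's standard re module. The function names and regular expressions are assumptions made for illustration; the tokenizers in NLTK and spaCy handle many more cases (abbreviations, URLs, hyphenated words).

```python
import re

def simple_sentence_tokenize(text):
    # Split after ., ! or ? when followed by whitespace.
    # Naive: it will also split after abbreviations such as "Dr."
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def simple_word_tokenize(sentence):
    # A token is a word (optionally with an internal apostrophe)
    # or a single punctuation mark.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)

text = "NLP is fun. It powers chatbots!"
sentences = simple_sentence_tokenize(text)
print(sentences)
for sentence in sentences:
    print(simple_word_tokenize(sentence))
```

Sentence tokenization runs first and word tokenization is applied to each sentence, which is the same order most NLP pipelines follow.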
Class Task 1: Implement a Simple NLP Pipeline
Step-by-Step Code Explanation
1. Install Required Libraries
pip install nltk spacy
python -m spacy download en_core_web_sm
2. Import Required Libraries
import spacy
# Load NLP model
nlp = spacy.load("en_core_web_sm")
# Sample text
text = "OpenAI released ChatGPT in 2022. It is widely used in the tech industry."
# Process text
doc = nlp(text)
# Tokenization and POS tagging
for token in doc:
    print(f"Token: {token.text}, POS: {token.pos_}")
# Named Entity Recognition (NER)
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")
Expected Output
Tokens and POS tags
Extracted entities (e.g., OpenAI, ChatGPT, 2022)
Assignment: Research Real-World NLP Applications
Research three real-world NLP applications (e.g., Google Translate, Alexa,
ChatGPT).
Write a one-page report on their use cases, challenges, and impact.
Week 2: N-gram Language Models and Part-of-Speech Tagging
Objective:
Learn about N-gram language models for predicting the next word in a sequence.
Implement an N-gram model for text generation.
Perform Part-of-Speech (POS) tagging using NLTK.
Key Terms & Definitions
1. N-gram Model – A probabilistic model that predicts the next word based on the
previous N-1 words.
2. Unigram, Bigram, Trigram – N-gram models where N = 1, 2, 3 respectively.
3. POS Tagging – Assigning grammatical tags (noun, verb, etc.) to words.
Class Task: Implement an N-gram Model in Python
1. Install NLTK
pip install nltk
2. Implement a Bigram Model
import nltk
from nltk import bigrams
from nltk.probability import FreqDist
nltk.download('reuters')
from nltk.corpus import reuters
# Load a sample text
words = list(reuters.words(categories='crude'))
# Generate bigrams
bi_grams = list(bigrams(words))
# Compute frequency distribution
fdist = FreqDist(bi_grams)
# Print most common bigrams
print(fdist.most_common(10))
Expected Output
List of the most frequent bigrams (word pairs) in the Reuters dataset.
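The bigram counts above can be turned into a small language model for text generation, which is one of the objectives of this week. The sketch below is a minimal illustration on a toy corpus (an assumption; the Reuters word list could be substituted). The names follower_counts, bigram_prob and generate are made up for this example.

```python
import random
from collections import Counter, defaultdict

# Toy corpus, invented for illustration.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# For every word, count which words follow it.
follower_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follower_counts[prev][nxt] += 1

def bigram_prob(prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev)."""
    followers = follower_counts.get(prev)
    if not followers:
        return 0.0
    return followers[nxt] / sum(followers.values())

def generate(start, length, seed=0):
    """Sample up to `length` more words, each conditioned on the previous word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        followers = follower_counts.get(words[-1])
        if not followers:  # the last word was never followed by anything
            break
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return words

print(bigram_prob("the", "cat"))  # "the" is followed by "cat" in 2 of its 4 occurrences
print(" ".join(generate("the", 8)))
```

With unsmoothed counts, any bigram that never appears in the training text gets probability zero; smoothing techniques address this on real corpora.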
Assignment: POS Tagging Using NLTK
Use NLTK’s POS tagger to tag words in a sample sentence.
Analyze the output and describe the most common POS tags.
Week 3: Hidden Markov Models (HMM) and Sequence Labeling
Objective:
Understand Hidden Markov Models (HMMs) for sequence-based classification.
Implement a POS tagger using HMMs.
Key Terms & Definitions
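1. Hidden Markov Model (HMM) – A probabilistic model in which a sequence of hidden states
(e.g., POS tags) generates the observed sequence (e.g., words); used for sequence prediction
tasks such as POS tagging and speech recognition.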
2. Transition Probabilities – The probability of moving from one hidden state (POS tag)
to another.
3. Emission Probabilities – The probability of a word appearing given a POS tag.
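The sketch below shows how transition and emission probabilities combine when an HMM tags a sentence. It is a minimal Viterbi decoder for a two-tag toy model; every probability in it is invented for illustration, and a trained tagger would estimate them from a tagged corpus.

```python
# Toy HMM: all numbers are made up for illustration.
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.6, "VERB": 0.4}
trans_p = {  # trans_p[previous_tag][next_tag]
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.8, "VERB": 0.2},
}
emit_p = {  # emit_p[tag][word]
    "NOUN": {"dogs": 0.5, "bark": 0.1, "cats": 0.4},
    "VERB": {"dogs": 0.1, "bark": 0.8, "cats": 0.1},
}

def viterbi(words):
    """Return the most probable tag sequence for `words`."""
    # best[t][s] = (probability of the best path ending in tag s at step t, previous tag)
    best = [{s: (start_p[s] * emit_p[s].get(words[0], 0.0), None) for s in states}]
    for word in words[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(word, 0.0), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))

print(viterbi(["dogs", "bark"]))  # ['NOUN', 'VERB']
```

Each step multiplies the best score so far by one transition probability and one emission probability, so the decoder never needs to enumerate every possible tag sequence.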
Class Task: Implement a Simple HMM POS Tagger
1. Install Required Libraries
pip install nltk
2. Train an HMM for POS Tagging
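import nltk
from nltk.tag import hmm
nltk.download('treebank')
nltk.download('punkt')  # tokenizer data for word_tokenize (newer NLTK releases may ask for 'punkt_tab')
# Load dataset
tagged_sentences = nltk.corpus.treebank.tagged_sents()
# Train a supervised HMM POS tagger on the first 3,000 tagged sentences
trainer = hmm.HiddenMarkovModelTrainer()
hmm_tagger = trainer.train_supervised(tagged_sentences[:3000])
# Test on a new sentence (words unseen in training may be tagged poorly)
test_sentence = nltk.word_tokenize("The stock market crashed yesterday.")
tagged_sentence = hmm_tagger.tag(test_sentence)
print(tagged_sentence)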
Expected Output
Tagged words with their respective POS labels.
Assignment: Implement Named Entity Recognition Using HMMs
Modify the HMM model to detect named entities instead of POS tags.
Week 4: Syntactic and Semantic Analysis
Objective:
Perform syntax analysis using dependency parsing.
Understand semantic role labeling in NLP.
Key Terms & Definitions
1. Syntax Analysis – Analyzing the grammatical structure of a sentence.
2. Dependency Parsing – Understanding relationships between words (subject, object,
verb).
3. Semantic Role Labeling (SRL) – Identifying the role of words in a sentence (who did
what to whom).
Class Task: Analyze Syntax of a Sentence Using spaCy
1. Install spaCy
pip install spacy
python -m spacy download en_core_web_sm
2. Implement Dependency Parsing
import spacy
nlp = spacy.load("en_core_web_sm")
sentence = "The cat sat on the mat."
doc = nlp(sentence)
for token in doc:
    print(f"Word: {token.text}, Dependency: {token.dep_}, Head: {token.head.text}")
Expected Output
Dependency relationships between words (e.g., subject, verb, object).
Assignment: Perform Semantic Role Labeling (SRL)
Use a pre-trained SRL model (e.g., AllenNLP) to label roles in a sentence.
Week 5: CAT 1
Week 6: Information Extraction (IE) and Named Entity Recognition (NER)
Objective:
Understand Information Extraction (IE) and its components (Named Entity
Recognition, Relation Extraction, and Event Extraction).
Learn about Named Entity Recognition (NER) and its applications in NLP.
Develop a NER system using Python and spaCy/NLTK.
Extract entities, relationships, and facts from unstructured text data.
Key Terms & Definitions
1. Information Extraction (IE) – The process of automatically extracting structured data
(entities, relations, facts) from unstructured text.
2. Named Entity Recognition (NER) – Identifying and classifying named entities
(persons, locations, organizations, dates, etc.) in text.
3. Relation Extraction (RE) – Identifying relationships between extracted entities (e.g.,
"Barack Obama" is the "President of the USA").
4. Event Extraction – Identifying important events from text (e.g., “COVID-19 pandemic
started in 2019”).
5. Tokenization – Splitting text into words or sentences for analysis.
6. Part-of-Speech (POS) Tagging – Identifying the grammatical role of words (noun,
verb, adjective, etc.).
7. Dependency Parsing – Analyzing the grammatical structure of a sentence.
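As a first intuition for relation extraction, the sketch below pulls one relation type out of text with a hand-written pattern. The regular expression and the function name extract_founded are assumptions for illustration only; practical systems rely on NER plus dependency parses or trained models rather than a single regex.

```python
import re

# Hypothetical pattern for one relation: "<Person> founded <Organization>".
# A person is one or more capitalized words; an organization is one capitalized token.
FOUNDED = re.compile(
    r"(?P<founder>[A-Z][a-z]+(?: [A-Z][a-z]+)*) founded (?P<org>[A-Z]\w+)"
)

def extract_founded(text):
    """Return (subject, relation, object) triples for the 'founded' relation."""
    return [(m.group("founder"), "founded", m.group("org"))
            for m in FOUNDED.finditer(text)]

text = ("Elon Musk founded SpaceX in 2002. "
        "Tesla is headquartered in Palo Alto, California.")
print(extract_founded(text))  # [('Elon Musk', 'founded', 'SpaceX')]
```

The output is a structured triple, which is the kind of fact an IE system stores after entities have been recognized.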
Class Task 1: Implement Named Entity Recognition (NER) Using spaCy
Step-by-Step Code Explanation
1. Install Required Libraries
pip install spacy
python -m spacy download en_core_web_sm
2. Import Required Libraries
import spacy
# Load the English NLP model
nlp = spacy.load("en_core_web_sm")
3. Apply Named Entity Recognition (NER) on Sample Text
# Sample text
text = "Elon Musk founded SpaceX in 2002. Tesla is headquartered in Palo Alto, California."
# Process text
doc = nlp(text)
# Extract and print named entities
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")
Expected Output
Extracted entities and their types (exact labels can vary with the model version):
Entity: Elon Musk, Label: PERSON
Entity: SpaceX, Label: ORG
Entity: 2002, Label: DATE
Entity: Tesla, Label: ORG
Entity: Palo Alto, Label: GPE
Entity: California, Label: GPE
Class Task 2: Implement a Custom Information Extraction System
1. Extract Entities and Their Relations from Text
# Inspect the dependency parse of `doc` from Class Task 1
for token in doc:
    print(f"Word: {token.text}, POS: {token.pos_}, Dependency: {token.dep_}, "
          f"Head: {token.head.text}")
2. Identify Relationships Between Entities
# Requires: pip install networkx matplotlib
import networkx as nx
import matplotlib.pyplot as plt
# Create a graph for relationships
graph = nx.DiGraph()
for token in doc:
    # Link each subject or object to the word that governs it
    if token.dep_ in ("nsubj", "dobj", "pobj"):
        graph.add_edge(token.head.text, token.text)
# Visualize the relationships
plt.figure(figsize=(6, 4))
nx.draw(graph, with_labels=True, node_color='lightblue', edge_color='gray')
plt.show()
Expected Output
A graph visualization of relationships between words.
Assignments
Assignment 1: Train a Custom NER Model on a Domain-Specific Dataset
Task:
Use spaCy to fine-tune an NER model for extracting entities from medical or legal
documents.
Sample Code Structure:
import spacy
from spacy.training import Example
nlp = spacy.blank("en")    # Create a blank English NLP model
ner = nlp.add_pipe("ner")  # Add Named Entity Recognizer
# Define custom training data; entity spans are (start_char, end_char, label)
TRAIN_DATA = [
    ("Apple is based in Cupertino.",
     {"entities": [(0, 5, "ORG"), (18, 27, "GPE")]}),
    ("Microsoft was founded by Bill Gates.",
     {"entities": [(0, 9, "ORG"), (25, 35, "PERSON")]}),
]
# Register every label before the model is initialized
for _, annotations in TRAIN_DATA:
    for _, _, label in annotations["entities"]:
        ner.add_label(label)
# Train the model
optimizer = nlp.initialize()
for itn in range(10):
    losses = {}
    for text, annotations in TRAIN_DATA:
        example = Example.from_dict(nlp.make_doc(text), annotations)
        nlp.update([example], sgd=optimizer, losses=losses)
    print(f"Losses at iteration {itn}: {losses}")
Expected Output
A custom-trained NER model capable of recognizing specific entities.
Assignment 2: Extract Information from News Articles or Wikipedia Pages
Task:
Use BeautifulSoup or Newspaper3k to extract text from a news article and apply NER.
Sample Code Structure:
# Requires: pip install newspaper3k
import spacy
from newspaper import Article
nlp = spacy.load("en_core_web_sm")
# Extract text from a news article
url = "https://www.bbc.com/news/world-us-canada-66080367"
article = Article(url)
article.download()
article.parse()
# Process article text
doc = nlp(article.text)
# Print named entities
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")
Expected Output
Named entities extracted from real-world news data.
Conclusion
This lesson covers Information Extraction (IE) and Named Entity Recognition (NER) using
Python and spaCy, including entity recognition, relationship extraction, and custom model
training.
Week 7: Machine Translation and Statistical Models
Objective:
Understand machine translation (MT) and its types (Rule-based, Statistical, and
Neural MT).
Learn about Statistical Machine Translation (SMT) and Neural Machine Translation
(NMT).
Implement and experiment with translation models in Python.
Use seq2seq models with attention for improved translation performance.
Key Terms & Definitions
1. Machine Translation (MT) – The use of software to automatically translate text
between languages.
2. Rule-Based Machine Translation (RBMT) – Uses linguistic rules and dictionaries
for translation.
3. Statistical Machine Translation (SMT) – Uses probability-based models to generate
translations.
4. Neural Machine Translation (NMT) – Uses deep learning models (RNNs, LSTMs,
Transformers) for translation.
5. Seq2Seq Model – A deep learning model that processes sequences (input → encoder →
decoder → output).
6. Attention Mechanism – A technique that helps NMT models focus on relevant words
during translation.
7. BLEU Score – A metric for evaluating translation quality by measuring n-gram overlap
between machine-generated translations and human reference translations.
Class Task 1: Implement a Simple Statistical Machine Translation Model
Step-by-Step Code Explanation
1. Install Required Libraries
pip install nltk
2. Import Required Libraries
import nltk
from nltk.translate.ibm1 import IBMModel1
from nltk.translate.api import AlignedSent
nltk.download('comtrans')
3. Load Parallel Sentences for Training
from nltk.corpus import comtrans
# Load English-French sentence pairs
english_sents = comtrans.aligned_sents('alignment-en-fr.txt')[:1000]
4. Train an IBM Model 1 for Word Alignment
aligned_sentences = [AlignedSent(pair.words, pair.mots) for pair in
english_sents]
# Train IBM Model 1
ibm1 = IBMModel1(aligned_sentences, 10) # Train for 10 iterations
5. Translate a Simple Sentence
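test_sentence = ["the", "house", "is", "blue"]
# translation_table[english_word][french_word] holds P(english_word | french_word)
table = ibm1.translation_table
translation = []
for word in test_sentence:
    # Ignore the NULL token (None) and keep the French word with the highest probability
    candidates = {fr: p for fr, p in table[word].items() if fr is not None}
    translation.append(max(candidates, key=candidates.get) if candidates else word)
print("Translated Sentence:", " ".join(translation))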
Expected Output
The model provides a simple English-to-French word translation.
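To show what IBM Model 1 training does internally, the sketch below runs the expectation-maximization (EM) loop on a three-sentence toy corpus. It is a simplified illustration, not NLTK's implementation: the corpus is invented, there is no NULL word, and the table is indexed as t_prob[source][target], which is the reverse of NLTK's translation_table.

```python
from collections import defaultdict

# Tiny parallel corpus, invented for illustration: (source words, target words).
pairs = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
source_vocab = {s for src, _ in pairs for s in src}
target_vocab = {t for _, tgt in pairs for t in tgt}

# t_prob[s][t] approximates P(target word t | source word s); start uniform.
t_prob = {s: {t: 1.0 / len(target_vocab) for t in target_vocab}
          for s in source_vocab}

for _ in range(20):  # EM iterations
    count = defaultdict(lambda: defaultdict(float))
    total = defaultdict(float)
    for src, tgt in pairs:
        for t in tgt:
            # E-step: split one unit of credit for t among the source words
            # of the same sentence, in proportion to the current probabilities.
            norm = sum(t_prob[s][t] for s in src)
            for s in src:
                frac = t_prob[s][t] / norm
                count[s][t] += frac
                total[s] += frac
    # M-step: renormalize the expected counts into probabilities.
    for s in source_vocab:
        for t in target_vocab:
            t_prob[s][t] = count[s][t] / total[s]

def translate_word(s):
    """Most probable target word for source word s."""
    return max(t_prob[s], key=t_prob[s].get)

print([translate_word(w) for w in "das buch".split()])  # ['the', 'book']
```

No word alignments are given as input. The model recovers them because "das" co-occurs with "the" in two sentences while "haus" and "buch" each co-occur with it only once.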
Class Task 2: Implement a Simple Neural Machine Translation Model (Seq2Seq)
1. Install TensorFlow
pip install tensorflow
2. Import Necessary Libraries
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np
3. Define Training Data
# Sample English-French sentence pairs
english_sentences = ["hello", "how are you", "good morning", "thank you"]
french_sentences = ["bonjour", "comment ça va", "bon matin", "merci"]
# Tokenize English and French sentences
tokenizer_en = Tokenizer()
tokenizer_fr = Tokenizer()
tokenizer_en.fit_on_texts(english_sentences)
tokenizer_fr.fit_on_texts(french_sentences)
sequences_en = tokenizer_en.texts_to_sequences(english_sentences)
sequences_fr = tokenizer_fr.texts_to_sequences(french_sentences)
# Pad sequences for uniform length
sequences_en = pad_sequences(sequences_en, padding='post')
sequences_fr = pad_sequences(sequences_fr, padding='post')
# Convert to NumPy arrays
X = np.array(sequences_en)
y = np.array(sequences_fr)
4. Build a Simple Seq2Seq Translation Model
# Define model
# Simplification: the word at each output position is predicted from the input
# position with the same index, so X and y must be padded to the same length
# (both are 3 tokens long here). A full encoder-decoder would instead generate
# the target sentence one token at a time.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(tokenizer_en.word_index) + 1,
                              output_dim=8),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64, return_sequences=True),  # keep one output per time step
    tf.keras.layers.Dense(len(tokenizer_fr.word_index) + 1,
                          activation='softmax')
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.summary()
5. Train the Model
model.fit(X, y, epochs=100, batch_size=8)
6. Translate a Sentence
def translate_sentence(sentence):
    seq = tokenizer_en.texts_to_sequences([sentence])
    seq = pad_sequences(seq, maxlen=X.shape[1], padding='post')
    prediction = model.predict(seq)                     # shape: (1, time steps, French vocab)
    translated_seq = np.argmax(prediction, axis=-1)[0]  # best word index per time step
    words = [tokenizer_fr.index_word[i] for i in translated_seq if i != 0]  # 0 is padding
    return " ".join(words)
print("English: hello")
print("French:", translate_sentence("hello"))
Expected Output
The model translates simple English phrases into French.
Assignments
Assignment 1: Train a Transformer-Based Translation Model
Task:
Implement a Transformer-based translation model using the Hugging Face Transformers
library. The sample below loads a pre-trained MarianMT model and uses PyTorch tensors.
Sample Code Structure:
# Requires: pip install transformers sentencepiece torch
from transformers import MarianMTModel, MarianTokenizer
# Load a pre-trained English-French translation model
model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
def translate_text(text):
    # return_tensors="pt" produces PyTorch tensors
    inputs = tokenizer(text, return_tensors="pt", padding=True,
                       truncation=True)
    translated = model.generate(**inputs)
    return tokenizer.decode(translated[0], skip_special_tokens=True)
print(translate_text("How are you?"))
Expected Output
The model translates longer and more complex sentences.
Assignment 2: Evaluate Machine Translation Performance with BLEU Score
Task:
Implement BLEU Score evaluation to compare machine-generated translations with
reference translations.
Sample Code Structure:
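from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
# Reference and candidate translations
reference = [["bonjour", "comment", "ça", "va"]]
candidate = ["bonjour", "ça", "va"]
# Compute BLEU score. The candidate has no matching 3-grams or 4-grams, so
# without smoothing the default 4-gram BLEU would be close to zero.
smoothing = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, smoothing_function=smoothing)
print("BLEU Score:", score)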
Expected Output
A BLEU score indicating translation accuracy.
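To see what a BLEU score is built from, the sketch below computes its two ingredients by hand for the same sentence pair: clipped n-gram precision and the brevity penalty. It is a simplified illustration for a single reference and n-grams up to length 2 (an assumption made to keep the example short), not a replacement for NLTK's sentence_bleu.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(reference, candidate, n):
    """Share of candidate n-grams found in the reference, with counts clipped."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

def brevity_penalty(reference, candidate):
    """Penalize candidates that are shorter than the reference."""
    if not candidate:
        return 0.0
    if len(candidate) >= len(reference):
        return 1.0
    return math.exp(1 - len(reference) / len(candidate))

reference = ["bonjour", "comment", "ça", "va"]
candidate = ["bonjour", "ça", "va"]

p1 = modified_precision(reference, candidate, 1)  # 3 of 3 unigrams match
p2 = modified_precision(reference, candidate, 2)  # 1 of 2 bigrams match
bp = brevity_penalty(reference, candidate)
# BLEU with equal weights on unigrams and bigrams.
bleu2 = bp * math.exp(0.5 * math.log(p1) + 0.5 * math.log(p2))
print(p1, p2, round(bp, 3), round(bleu2, 3))
```

The candidate gets every word right but is penalized twice: once for the missing bigrams around "comment" and once for being shorter than the reference.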
Conclusion
This lesson covers machine translation using statistical and neural models, including IBM
Model 1, Seq2Seq models, and Transformers.
Week 8: Introduction to Deep Learning for NLP
Objective:
Understand the basics of Deep Learning in NLP.
Learn about word embeddings and why they are important.
Train a Word2Vec model to learn vector representations of words.
Explore applications of word embeddings in NLP tasks like text classification and
sentiment analysis.
Key Terms & Definitions
1. Deep Learning for NLP – The use of deep neural networks to process and understand
natural language.
2. Word Embeddings – Dense numerical vectors that represent words in a
multi-dimensional space.
3. Word2Vec – A shallow neural network model that learns word embeddings using CBOW
(Continuous Bag of Words) or Skip-gram architectures.
4. CBOW (Continuous Bag of Words) – Predicts a target word based on surrounding
context words.
5. Skip-gram Model – Predicts surrounding words given a target word.
6. GloVe (Global Vectors for Word Representation) – A different approach to word
embeddings based on matrix factorization of word co-occurrence statistics.
7. FastText – An extension of Word2Vec that builds word vectors from character n-grams (subword information).
8. Embedding Layer in TensorFlow – A layer that converts word indices into dense
vectors.
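The difference between CBOW and Skip-gram is easiest to see in the training examples each one builds from the same sentence. The sketch below is a minimal illustration with a fixed context window; the function names are made up for this example, and real Word2Vec implementations add details such as subsampling and negative sampling.

```python
def skipgram_pairs(tokens, window):
    """(target, context) pairs: Skip-gram predicts each context word from the target."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((target, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs

def cbow_examples(tokens, window):
    """(context words, target) examples: CBOW predicts the target from its context."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, target))
    return examples

tokens = ["deep", "learning", "for", "nlp"]
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))
```

Skip-gram produces one training example per (target, context) pair, while CBOW produces one example per position with the whole context as input.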
Class Task 1: Implement a Word2Vec Model in Python
Step-by-Step Code Explanation
1. Install Required Libraries
pip install gensim nltk
2. Import Necessary Libraries
import nltk
from nltk.tokenize import word_tokenize
from gensim.models import Word2Vec
nltk.download('punkt')
3. Prepare Text Data
text = "Deep learning for NLP enables computers to understand and process human language efficiently."
# Tokenization
tokens = word_tokenize(text.lower()) # Convert to lowercase and tokenize
print("Tokens:", tokens)
4. Train the Word2Vec Model
sentences = [tokens] # Word2Vec requires a list of tokenized sentences
# Train a Word2Vec model using the Skip-gram method
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
# Save the model
model.save("word2vec_model.bin")
5. Test the Trained Word Embeddings
# Load the trained model
model = Word2Vec.load("word2vec_model.bin")
# Get vector for a word
vector = model.wv['language']
print("Word Vector for 'language':", vector[:10]) # Show first 10 values
# Find similar words
print("Most similar words to 'language':", model.wv.most_similar('language'))
Expected Output
The model generates word vectors.
It finds similar words based on learned embeddings.
Class Task 2: Use Word2Vec in a Text Classification Model
1. Install TensorFlow
pip install tensorflow
2. Build a Text Classification Model with an Embedding Layer
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Sample text data
sentences = ["Deep learning is powerful",
             "NLP helps computers understand text",
             "Word embeddings improve accuracy",
             "Neural networks process data"]
# Tokenize the text
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(sentences)
# Pad sequences to ensure uniform length
padded_sequences = pad_sequences(sequences, padding='post')
# Define a simple neural network
# Note: this Embedding layer is trained from scratch; it does not load the
# Word2Vec vectors from Class Task 1.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(word_index)+1, output_dim=8,
                              input_length=len(padded_sequences[0])),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.summary()
Expected Output
A simple neural network that uses word embeddings for text classification.
Assignments
Assignment 1: Train a Word2Vec Model on a Larger Dataset
Task:
Use a larger text dataset (e.g., a book or Wikipedia articles) to train a Word2Vec
model.
Compare results between CBOW and Skip-gram models.
Sample Code Structure:
from gensim.models import Word2Vec
from nltk.tokenize import sent_tokenize, word_tokenize
# Load a larger text dataset
text_data = open("text_corpus.txt", encoding="utf-8").read()
sentences = [word_tokenize(sent.lower()) for sent in sent_tokenize(text_data)]
# Train Word2Vec using CBOW
cbow_model = Word2Vec(sentences, vector_size=100, window=5, min_count=2, sg=0)
# Train Word2Vec using Skip-gram
skipgram_model = Word2Vec(sentences, vector_size=100, window=5, min_count=2,
sg=1)
Expected Output
Word2Vec models trained on larger datasets, showing differences between CBOW and
Skip-gram methods.
Assignment 2: Compare Word Embeddings from Word2Vec, GloVe, and FastText
Task:
Compare Word2Vec, GloVe, and FastText embeddings for a set of words.
Analyze the quality of word similarities.
Sample Code Structure:
from gensim.models import FastText
from gensim.models import KeyedVectors
# `sentences` and `model` (the trained Word2Vec model) come from the earlier steps
# Train a FastText model
fasttext_model = FastText(sentences, vector_size=100, window=5, min_count=2)
# Load pre-trained GloVe embeddings (GloVe text files have no header line,
# so gensim 4.x needs no_header=True)
glove_model = KeyedVectors.load_word2vec_format("glove.6B.100d.txt",
                                                binary=False, no_header=True)
# Compare word vectors
word = "language"
print("Word2Vec Similar Words:", model.wv.most_similar(word))
print("FastText Similar Words:", fasttext_model.wv.most_similar(word))
print("GloVe Similar Words:", glove_model.most_similar(word))
Expected Output
Comparison of similar words for Word2Vec, FastText, and GloVe.
Conclusion
This lesson covers deep learning for NLP, including Word2Vec, embeddings, and their
applications.
Week 9: Recurrent Neural Networks (RNNs) and Language Models
Objective:
Understand the role of RNNs in language modeling.
Learn how RNNs handle sequential data and their challenges.
Implement a simple RNN for language modeling using Python.
Explore text generation using an RNN model.
Key Terms & Definitions
1. Recurrent Neural Network (RNN) – A type of neural network designed for sequential
data processing, in which a hidden state carries information from earlier time steps into
later computations.
2. Language Model – A model that predicts the next word in a sequence based on
previous words.
3. Vanishing Gradient Problem – A common issue in RNNs where gradients shrink as they are
propagated back through many time steps, making long-range dependencies difficult to learn.
4. Long Short-Term Memory (LSTM) – A type of RNN that mitigates vanishing
gradient issues using memory cells.
5. Gated Recurrent Unit (GRU) – A simplified version of LSTMs, designed to be
computationally efficient.
6. Embedding Layer – Converts words into dense vectors for better processing by RNNs.
7. Sequence-to-Sequence (Seq2Seq) Model – A model architecture used in machine
translation, chatbots, and summarization.
8. Temperature (in text generation) – A parameter that controls randomness in
RNN-generated text: lower values make sampling more predictable, higher values more varied.
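The effect of temperature can be checked with a few lines of standard-library Python. The sketch below divides the model's raw scores (logits) by the temperature before applying softmax and then samples from the result. The logits are invented for illustration and the function names are assumptions; the class task in this week picks the most likely character instead of sampling.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; temperature must be greater than 0."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(logits, temperature, rng):
    """Draw one index at random according to the tempered probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs)[0]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate characters
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])

rng = random.Random(0)
print([sample_index(logits, 1.0, rng) for _ in range(10)])
```

At a low temperature the top-scoring character takes most of the probability mass; at a high temperature the distribution flattens and less likely characters are sampled more often.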
Class Task 1: Implement a Simple RNN for Text Generation
Step-by-Step Code Explanation
1. Install Dependencies
pip install tensorflow numpy
2. Import Required Libraries
import tensorflow as tf
import numpy as np
3. Prepare Training Data
text = "hello world welcome to recurrent neural networks"
chars = sorted(set(text)) # Unique characters
char_to_index = {c: i for i, c in enumerate(chars)}
index_to_char = {i: c for i, c in enumerate(chars)}
# Convert text into numerical sequence
encoded_text = [char_to_index[c] for c in text]
print("Encoded Text:", encoded_text)
4. Create Training Sequences
seq_length = 5
X, y = [], []
for i in range(len(encoded_text) - seq_length):
    X.append(encoded_text[i:i+seq_length])  # input: seq_length characters
    y.append(encoded_text[i+seq_length])    # target: the character that follows
X = np.array(X)
y = np.array(y)
5. Build the RNN Model
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(chars), output_dim=8,
                              input_length=seq_length),
    tf.keras.layers.SimpleRNN(64, return_sequences=True),
    tf.keras.layers.SimpleRNN(64),
    tf.keras.layers.Dense(len(chars), activation='softmax')
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.summary()
6. Train the Model
model.fit(X, y, epochs=100, batch_size=8)
Expected Output
The model learns the character sequences in the text.
Class Task 2: Generate Text Using the Trained RNN
1. Function to Predict Next Character
def generate_text(seed_text, length):
    for _ in range(length):
        # Encode the last seq_length characters of the text generated so far
        encoded_seed = [char_to_index[c] for c in seed_text]
        encoded_seed = np.array(encoded_seed[-seq_length:]).reshape(1, -1)
        prediction = model.predict(encoded_seed)
        next_char = index_to_char[np.argmax(prediction)]  # greedy: most likely character
        seed_text += next_char
    return seed_text
2. Generate New Text
print(generate_text("hello", 20))
Expected Output
The model generates a sequence of characters based on the training text.
Assignments
Assignment 1: Implement an LSTM-Based Text Generator
Task:
Modify the RNN to use LSTMs instead of SimpleRNN.
Sample Code Structure:
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(chars), output_dim=8,
                              input_length=seq_length),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(len(chars), activation='softmax')
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.fit(X, y, epochs=100, batch_size=8)
Expected Output
A text generator using LSTMs instead of RNNs.
Assignment 2: Train an RNN on a Larger Text Dataset
Task:
Train an RNN on a larger text dataset (e.g., Shakespeare’s text).
Sample Code Structure:
import tensorflow_datasets as tfds
# Load a text dataset. Each example is a dict with a single "text" feature,
# so as_supervised=True (which expects input/label pairs) is not used.
dataset = tfds.load("tiny_shakespeare", split="train")
text_data = ""
for example in dataset:
    text_data += example["text"].numpy().decode('utf-8')
print("Sample Text:", text_data[:500])
Expected Output
The model learns larger vocabulary and more complex text patterns.
Conclusion
This lesson covers RNNs and language modeling, including text generation using
SimpleRNN and LSTMs.
Week 10: CAT 2.
Week 11: Advanced Neural Networks and Attention Models
Objective:
Understand advanced neural network architectures such as transformers.
Learn how attention mechanisms work in deep learning.
Implement an attention mechanism using TensorFlow and Keras.
Apply attention to tasks such as natural language processing (NLP) and sequence
modeling.
Key Terms & Definitions
1. Attention Mechanism – A method that allows a model to focus on specific parts of an
input sequence when making predictions.
2. Self-Attention – A mechanism where each token in a sequence attends to every token in
the same sequence, including itself.
3. Multi-Head Attention – A technique where multiple attention layers process input in
parallel to learn different representations.
4. Transformers – A deep learning model architecture that uses attention mechanisms to
process sequential data efficiently.
5. Query, Key, and Value (QKV) – The fundamental components of attention mechanisms
used to compute attention scores.
6. Softmax Function – A function that converts attention scores into probabilities.
7. TensorFlow/Keras – A deep learning framework used to build and train attention-based
models.
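The terms above fit together in scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The sketch below computes it with plain Python lists so each step is visible. The vectors are invented for illustration, and the function names are assumptions; the class tasks use TensorFlow for the same idea.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights); each argument is a list of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Attention score: dot product of the query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)  # scores -> probabilities
        # Output: weighted sum of the value vectors
        output = [sum(w * v[j] for w, v in zip(weights, values))
                  for j in range(len(values[0]))]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

queries = [[1.0, 0.0]]              # one query, made-up numbers
keys = [[1.0, 0.0], [0.0, 1.0]]     # two keys
values = [[10.0, 0.0], [0.0, 10.0]]  # one value vector per key
outputs, weights = scaled_dot_product_attention(queries, keys, values)
print("Weights:", weights)
print("Outputs:", outputs)
```

The query points in the same direction as the first key, so the first value vector receives the larger weight in the output.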
Class Task 1: Implement a Basic Attention Mechanism in Python
Step-by-Step Code Explanation
1. Install Dependencies
Ensure you have TensorFlow installed:
pip install tensorflow numpy
2. Import Libraries
import tensorflow as tf
import numpy as np
3. Define the Attention Mechanism
class SimpleAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super(SimpleAttention, self).__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)
    def call(self, query, values):
        # query: (batch, features) -> (batch, 1, features) so it broadcasts over time steps
        query_with_time_axis = tf.expand_dims(query, 1)
        # Additive (Bahdanau-style) score for each time step: (batch, time, 1)
        score = self.V(tf.nn.tanh(self.W1(query_with_time_axis) +
                                  self.W2(values)))
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * values
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights
4. Generate Sample Data and Apply Attention
query = tf.constant([[0.5, 1.0, 0.3]], dtype=tf.float32) # Query vector, shape (1, 3)
# Random values simulating a batch of one sequence of 10 embeddings, shape (1, 10, 3)
values = tf.random.normal([1, 10, 3])
attention_layer = SimpleAttention(10)
context_vector, attention_weights = attention_layer(query, values)
print("Context Vector:", context_vector.numpy())
print("Attention Weights:", attention_weights.numpy())
Expected Output
A context vector representing weighted information from the input sequence.
Attention weights showing how much importance is assigned to each input element.
Class Task 2: Implement Multi-Head Attention Using TensorFlow
1. Define Multi-Head Attention Layer
class MultiHeadAttention(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads):
        super(MultiHeadAttention, self).__init__()
        self.num_heads = num_heads
        self.d_model = d_model
        assert d_model % num_heads == 0 # Ensure division is even
        self.depth = d_model // num_heads
        self.Wq = tf.keras.layers.Dense(d_model)
        self.Wk = tf.keras.layers.Dense(d_model)
        self.Wv = tf.keras.layers.Dense(d_model)
        self.dense = tf.keras.layers.Dense(d_model)
    def split_heads(self, x, batch_size):
        # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, depth)
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])
    def call(self, query, key, value):
        batch_size = tf.shape(query)[0]
        Q = self.split_heads(self.Wq(query), batch_size)
        K = self.split_heads(self.Wk(key), batch_size)
        V = self.split_heads(self.Wv(value), batch_size)
        scores = tf.matmul(Q, K, transpose_b=True) / tf.math.sqrt(
            tf.cast(self.depth, tf.float32))
        attention_weights = tf.nn.softmax(scores, axis=-1)
        output = tf.matmul(attention_weights, V)
        # Merge the heads back: (batch, seq_len, d_model)
        output = tf.transpose(output, perm=[0, 2, 1, 3])
        output = tf.reshape(output, (batch_size, -1, self.d_model))
        return self.dense(output)
2. Apply Multi-Head Attention
query = tf.random.normal([2, 5, 64]) # Batch of 2, 5 tokens, 64 features
key = tf.random.normal([2, 5, 64])
value = tf.random.normal([2, 5, 64])
mha = MultiHeadAttention(d_model=64, num_heads=8)
output = mha(query, key, value)
print("Multi-Head Attention Output Shape:", output.shape)
Expected Output
A transformed output where attention is applied across multiple heads.
Assignments
Assignment 1: Implement Attention in an RNN-Based Sequence Model
Task:
Build an LSTM-based text generation model using an attention layer.
Sample Code Structure:
class AttentionLSTM(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.lstm = tf.keras.layers.LSTM(rnn_units, return_sequences=True,
                                         return_state=True)
        self.attention = SimpleAttention(rnn_units)
        self.dense = tf.keras.layers.Dense(vocab_size)
    def call(self, inputs, states):
        x = self.embedding(inputs)
        lstm_output, state_h, state_c = self.lstm(x, initial_state=states)
        # Use the final hidden state as the query over all LSTM outputs
        context_vector, _ = self.attention(state_h, lstm_output)
        x = tf.concat([context_vector, state_h], axis=-1)
        x = self.dense(x)
        return x, (state_h, state_c)
Assignment 2: Implement Transformer Model Using TensorFlow
Task:
Implement a transformer-based text classification model using TensorFlow.
Sample Code Structure:
import tensorflow as tf
from tensorflow.keras.layers import (Embedding, Dense, Dropout,
                                     LayerNormalization, GlobalAveragePooling1D)
from tensorflow.keras.models import Model
class TransformerBlock(tf.keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super().__init__()
        # Custom MultiHeadAttention layer defined in Class Task 2
        self.att = MultiHeadAttention(embed_dim, num_heads)
        self.ffn = tf.keras.Sequential([Dense(ff_dim, activation="relu"),
                                        Dense(embed_dim)])
        self.norm1 = LayerNormalization(epsilon=1e-6)
        self.norm2 = LayerNormalization(epsilon=1e-6)
        self.dropout1 = Dropout(rate)
        self.dropout2 = Dropout(rate)
    def call(self, inputs, training=False):
        attn_output = self.dropout1(self.att(inputs, inputs, inputs), training=training)
        out1 = self.norm1(inputs + attn_output)  # residual connection + layer normalization
        ffn_output = self.dropout2(self.ffn(out1), training=training)
        return self.norm2(out1 + ffn_output)
# Define a simple transformer model (no positional encoding, to keep the sample short)
class TransformerModel(Model):
    def __init__(self, vocab_size, embed_dim, num_heads, ff_dim):
        super().__init__()
        self.embedding = Embedding(vocab_size, embed_dim)
        self.transformer_block = TransformerBlock(embed_dim, num_heads,
                                                  ff_dim)
        self.pool = GlobalAveragePooling1D()
        self.dense = Dense(1, activation="sigmoid")
    def call(self, inputs):
        x = self.embedding(inputs)
        x = self.transformer_block(x)
        x = self.pool(x)  # average over tokens: one vector per sequence
        return self.dense(x)
Conclusion
This lesson provides hands-on experience in implementing attention mechanisms, covering
basic attention, multi-head attention, and transformers using TensorFlow.
Week 12: End-to-End Speech Processing Models
Objective:
Understand the concept of end-to-end speech processing models.
Learn how to build a simple speech recognition system using Python.
Use pre-trained deep learning models (e.g., DeepSpeech, Wav2Vec 2.0).
Implement a basic neural network for speech recognition.
Key Terms & Definitions
1. End-to-End Speech Recognition – A method where a single neural network learns to
map audio input to text output without intermediate steps.
2. DeepSpeech – A deep learning model for automatic speech recognition (ASR) developed
by Mozilla.
3. Wav2Vec 2.0 – A self-supervised learning model by Facebook AI that improves ASR
performance by learning representations from raw audio data.
4. Mel-Frequency Cepstral Coefficients (MFCCs) – Compact features that summarize the
short-term spectrum of an audio signal on the perceptually motivated mel scale; widely
used to represent speech.
5. Recurrent Neural Network (RNN) – A type of neural network suited for processing
sequential data, such as speech.
6. Connectionist Temporal Classification (CTC) – A loss function used in ASR to train on
audio-text pairs without frame-level alignment; it sums over every way the text can be
aligned with the input audio frames.
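To make the CTC definition concrete, the sketch below shows greedy CTC decoding: take the most likely symbol for each audio frame, collapse consecutive repeats, then drop the special blank symbol. The vocabulary, the frame ids, and the helper name ctc_greedy_decode are made up for illustration and do not come from any particular model.

```python
def ctc_greedy_decode(frame_ids, id_to_char, blank_id=0):
    """Collapse repeated ids, then drop blanks (greedy CTC decoding)."""
    chars = []
    previous = None
    for idx in frame_ids:
        # Emit a character only when the id changes and is not the blank.
        if idx != previous and idx != blank_id:
            chars.append(id_to_char[idx])
        previous = idx
    return "".join(chars)

# Toy vocabulary and per-frame argmax ids (illustrative values only).
vocab = {1: "c", 2: "a", 3: "t", 4: "o"}
frames = [1, 1, 0, 2, 2, 2, 0, 0, 3]
print(ctc_greedy_decode(frames, vocab))           # cat

# A blank between two identical ids keeps both letters.
print(ctc_greedy_decode([3, 4, 4, 0, 4], vocab))  # too
```

The blank symbol is what lets CTC represent doubled letters and silent frames without needing a frame-by-frame transcript.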
Class Task 1: Build a Simple Speech-to-Text System Using
SpeechRecognition Library
Step-by-Step Code Explanation
1. Install Dependencies
Ensure you have the required libraries installed:
pip install SpeechRecognition pyaudio
2. Import Libraries
import speech_recognition as sr
3. Initialize the Recognizer
recognizer = sr.Recognizer()
4. Capture Audio from Microphone
with sr.Microphone() as source:
    print("Say something...")
    recognizer.adjust_for_ambient_noise(source)  # Adjust for background noise
    audio = recognizer.listen(source)  # Capture speech
5. Convert Speech to Text
try:
    text = recognizer.recognize_google(audio)  # Use Google’s ASR
    print("You said:", text)
except sr.UnknownValueError:
    print("Sorry, could not understand the audio")
except sr.RequestError:
    print("Error with the recognition service")
Expected Output
The system will listen to the user's speech and display the transcribed text.
Class Task 2: Using Pre-trained DeepSpeech Model for
Speech Recognition
1. Install Dependencies
pip install deepspeech numpy
Download the DeepSpeech model and scorer from:
https://github.com/mozilla/DeepSpeech/releases
2. Load the Model
import deepspeech
import numpy as np
import wave
model_file_path = 'deepspeech-0.9.3-models.pbmm' # Path to DeepSpeech model
scorer_file_path = 'deepspeech-0.9.3-models.scorer' # Path to scorer file
model = deepspeech.Model(model_file_path)
model.enableExternalScorer(scorer_file_path)
3. Load an Audio File
def read_wav_file(filename):
    with wave.open(filename, 'rb') as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        audio = np.frombuffer(wf.readframes(frames), dtype=np.int16)
    return rate, audio
audio_file = "sample_audio.wav" # Replace with your own .wav file
rate, audio = read_wav_file(audio_file)
4. Perform Speech Recognition
text = model.stt(audio)
print("Transcribed Text:", text)
Expected Output
The system will transcribe the speech from the given .wav file and display the text
output.
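The loader above reads samples as 16-bit integers and never checks the file format. As far as we know, the released English DeepSpeech models expect 16 kHz, mono, 16-bit PCM audio, so a file in another format is likely to give a poor transcription. The sketch below checks a file with the standard wave module before it is passed to the model. The helper names describe_wav and is_deepspeech_ready are hypothetical, and the test tone is generated only so the check can be tried without a recording.

```python
import math
import os
import struct
import tempfile
import wave

def describe_wav(path):
    """Return (sample_rate, channels, bits_per_sample) for a WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getsampwidth() * 8

def is_deepspeech_ready(path, expected_rate=16000):
    """True if the file is mono, 16-bit, and sampled at expected_rate."""
    rate, channels, bits = describe_wav(path)
    return rate == expected_rate and channels == 1 and bits == 16

# Write a one-second 440 Hz test tone (16 kHz, mono, 16-bit).
demo_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
    wf.setframerate(16000)
    samples = [int(12000 * math.sin(2 * math.pi * 440 * n / 16000))
               for n in range(16000)]
    wf.writeframes(struct.pack("<%dh" % len(samples), *samples))

print(describe_wav(demo_path))         # (16000, 1, 16)
print(is_deepspeech_ready(demo_path))  # True
```

If the check fails, the audio can be converted first, for example by resampling to 16 kHz and mixing down to one channel.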
Assignments
Assignment 1: Build a Custom Speech Recognition Model Using Wav2Vec 2.0
Task:
Implement Wav2Vec 2.0 from Hugging Face’s transformers library to convert speech
to text.
Use a sample .wav file for transcription.
Sample Code Structure:
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import torch
import librosa
# Load pre-trained Wav2Vec2.0 model
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
# Load an audio file
audio_file = "sample_audio.wav"
audio, rate = librosa.load(audio_file, sr=16000)
# Preprocess the audio
input_values = processor(audio, sampling_rate=rate,
                         return_tensors="pt").input_values
# Perform inference
with torch.no_grad():
    logits = model(input_values).logits
# Decode the output
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print("Transcribed Text:", transcription)
Expected Output
A transcribed text output from the .wav file.
Assignment 2: Implement a Simple Speech Command Classifier Using
TensorFlow
Task:
Train a simple deep learning model to classify speech commands like "yes," "no," "stop,"
etc.
Use TensorFlow/Keras with a small dataset.
Sample Code Structure:
import tensorflow as tf
import numpy as np
import librosa
import os
# Load dataset (assume we have 'yes' and 'no' audio samples)
dataset_path = "speech_commands_dataset/"
# Load audio files
def load_audio_files(directory):
    data, labels = [], []
    label_map = {"yes": 0, "no": 1}  # Assign numerical labels
    for label in label_map:
        files = os.listdir(os.path.join(directory, label))
        for file in files:
            audio_path = os.path.join(directory, label, file)
            audio, _ = librosa.load(audio_path, sr=16000)
            # Pad or trim to exactly 16000 samples so all clips stack into one array
            audio = librosa.util.fix_length(audio, size=16000)
            data.append(audio)
            labels.append(label_map[label])
    return np.array(data), np.array(labels)
# Prepare dataset
X, y = load_audio_files(dataset_path)
X = np.expand_dims(X, axis=-1)  # Add channel dimension
# Build a simple CNN model
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(16, kernel_size=3, activation='relu',
                           input_shape=(16000, 1)),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(32, kernel_size=3, activation='relu'),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')  # Two classes (yes/no)
])
# Compile and train the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=16)
# Save the model
model.save("speech_command_model.h5")
Expected Output
A trained model that can classify speech commands.
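The CNN above declares input_shape=(16000, 1), so every clip must be exactly 16000 samples long (one second at 16 kHz), while real recordings vary in length. The NumPy sketch below shows what fixing the length amounts to: short clips are padded with silence and long clips are cut. The helper name pad_or_trim is hypothetical, and the arrays of ones stand in for real audio.

```python
import numpy as np

def pad_or_trim(audio, target_len=16000):
    """Return a copy of a 1-D clip that is exactly target_len samples long."""
    audio = np.asarray(audio, dtype=np.float32)
    if len(audio) >= target_len:
        return audio[:target_len].copy()  # Cut clips that are too long
    # Append zeros (silence) to clips that are too short.
    return np.pad(audio, (0, target_len - len(audio)))

clips = [np.ones(12000), np.ones(20000)]  # A short and a long stand-in clip
batch = np.stack([pad_or_trim(c) for c in clips])
print(batch.shape)              # (2, 16000)
print(batch[0, 12000:].sum())   # 0.0, the padded tail is silence
```

Once all clips share one length they stack into a single array of shape (num_clips, 16000), which is what the model's first layer expects after the channel dimension is added.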
Conclusion
This lesson provides students with hands-on experience in speech processing: simple ASR
with Python’s SpeechRecognition library, pre-trained models (DeepSpeech and Wav2Vec 2.0),
and a TensorFlow-based speech command classifier.
Week 13: Issues and Architectures for NLP
Objective:
Understand key issues and challenges in NLP, such as ambiguity, bias, and
scalability.
Explore various NLP architectures, including RNNs, LSTMs, Transformers, and
BERT.
Research and analyze future trends in NLP like self-supervised learning, multilingual
models, and low-resource NLP.
Implement basic and advanced NLP models using Python and TensorFlow.
Key Terms & Definitions
1. Natural Language Processing (NLP) – A branch of AI that helps computers
understand, interpret, and generate human language.
2. Tokenization – Splitting text into meaningful units (words, subwords, or characters).
3. Word Embeddings – Representing words as numerical vectors (e.g., Word2Vec, GloVe,
BERT embeddings).
4. Recurrent Neural Networks (RNNs) – A deep learning model for processing sequential
data.
5. Long Short-Term Memory (LSTM) – A type of RNN whose gating mechanism mitigates the
vanishing-gradient problem, helping it learn long-range dependencies in text.
6. Transformers – A deep learning architecture that replaces recurrence with
self-attention, letting every token attend directly to every other token for better
context understanding.
7. BERT (Bidirectional Encoder Representations from Transformers) – A transformer-
based model that understands word context from both left and right directions.
8. GPT (Generative Pre-trained Transformer) – A transformer-based model specialized in
text generation; it produces text one token at a time, predicting each token from the
ones before it.
9. Multilingual NLP – Models trained to process multiple languages simultaneously.
10. Ethical NLP – Addressing challenges such as bias, fairness, and interpretability in
language models.
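The self-attention mentioned in the Transformers definition can be illustrated with a few lines of NumPy. This is a minimal sketch of scaled dot-product self-attention in which queries, keys, and values are all the raw input (a real transformer first applies learned projections). The function name self_attention and the toy token vectors are assumptions made for illustration.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                         # Token-to-token similarity
    scores = scores - scores.max(axis=-1, keepdims=True)  # Numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)        # Softmax over each row
    return weights @ x, weights

# Three toy "token" vectors of dimension 2 (illustrative values only).
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
context, weights = self_attention(tokens)
print(weights.sum(axis=-1))  # Each row of attention weights sums to 1
print(context.shape)         # (3, 2)
```

Each output row is a weighted average of all token vectors, which is how a token's representation comes to depend on its whole context.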
Class Task 1: Implement Text Preprocessing and Word
Embeddings in Python
Step-by-Step Code Explanation
1. Install Dependencies
pip install nltk gensim
2. Import Required Libraries
import nltk
from nltk.tokenize import word_tokenize
from gensim.models import Word2Vec
nltk.download('punkt')
nltk.download('punkt_tab')  # Required by word_tokenize in newer NLTK releases
3. Tokenize and Preprocess Text
text = "Natural language processing enables computers to understand human language."
tokens = word_tokenize(text.lower()) # Convert to lowercase and tokenize
print("Tokens:", tokens)
4. Train a Word2Vec Model
# Prepare training data
sentences = [tokens] # Word2Vec requires a list of tokenized sentences
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)
# Get vector representation of a word
word_vector = model.wv['language']
print("Word Vector for 'language':", word_vector[:10]) # Show first 10 values
Expected Output
Tokenized text
Numerical word embeddings
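Word vectors become useful when they are compared with each other, and the usual measure is cosine similarity. The plain-Python sketch below shows the calculation on three toy vectors. The vectors and their word labels are made up for illustration and are not output from the Word2Vec model above.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 = same direction, -1 = opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" (illustrative values only).
king = [0.9, 0.1, 0.4]
queen = [0.8, 0.2, 0.5]
banana = [-0.3, 0.9, -0.1]

print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # True
```

With gensim the same comparison is available as model.wv.similarity(word1, word2). Note that a model trained on a single sentence, as in this task, has too little data for its similarities to be meaningful.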
Class Task 2: Implement a Transformer-Based Text
Classification Model
1. Install Transformers Library
pip install transformers torch datasets
2. Load Pre-Trained BERT Model
from transformers import BertTokenizer, BertForSequenceClassification
import torch
# Load tokenizer and model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)
3. Tokenize Input Sentence
sentence = "NLP is an exciting field of artificial intelligence."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True,
                   padding=True)
4. Perform Text Classification
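model.eval()  # Disable dropout for inference
with torch.no_grad():  # No gradients are needed for prediction
    outputs = model(**inputs)
logits = outputs.logits
prediction = torch.argmax(logits, dim=1).item()
print("Predicted Class:", prediction)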
Expected Output
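The model will assign the input sentence to one of two classes (0 or 1). Because the
classification head added on top of bert-base-uncased is randomly initialized, the
prediction is not meaningful (e.g., as positive/negative sentiment) until the model is
fine-tuned on labeled data.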
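The logits computed in step 4 are unnormalized scores, one per class. Applying softmax turns them into probabilities, which is often easier to interpret than the bare class index. A minimal plain-Python sketch follows; the logit values are made up for illustration and are not real model output.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # Subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.2, -0.1]  # Illustrative scores for class 0 and class 1
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)

print(probs)      # Two probabilities that sum to 1
print(predicted)  # 0
```

With PyTorch tensors the equivalent call is torch.softmax(logits, dim=1).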
Assignments
Assignment 1: Research on Future Trends in NLP
Task:
Write a report on future trends in NLP, covering:
o Self-Supervised Learning (SSL) (e.g., Wav2Vec, T5, BERT)
o Multilingual NLP (e.g., mBERT, XLM-R)
o Low-Resource NLP (training models with minimal data)
o Ethics & Bias in NLP
Expected Submission:
A 3–5 page report with examples of real-world NLP applications.
Assignment 2: Implement a GPT-Based Text Generator
Task:
Use GPT-2 to generate text based on a prompt.
Sample Code Structure:
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# Load GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
# Input text prompt
prompt = "The future of natural language processing is"
# Tokenize input
inputs = tokenizer(prompt, return_tensors="pt")
# Generate text (GPT-2 has no pad token, so reuse the end-of-sequence token)
output = model.generate(**inputs, max_length=50,
                        pad_token_id=tokenizer.eos_token_id)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print("Generated Text:", generated_text)
Expected Output
The model will generate text continuing the given prompt.
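Called as above, generate picks the single most likely token at each step (greedy decoding), which tends to produce repetitive text. A common alternative is top-k sampling, available in the transformers library by passing do_sample=True and top_k to generate. The plain-Python sketch below shows the idea on one step of made-up scores. The helper name top_k_sample and the logit values are assumptions for illustration.

```python
import math
import random

def top_k_sample(logits, k, rng):
    """Sample one token id from the k highest-scoring logits."""
    # Keep only the ids of the k largest scores.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    top_max = logits[top_ids[0]]
    # Unnormalized softmax weights over the kept ids.
    weights = [math.exp(logits[i] - top_max) for i in top_ids]
    return rng.choices(top_ids, weights=weights, k=1)[0]

logits = [2.0, 0.5, 1.5, -1.0, 0.0]  # Illustrative scores for 5 token ids
rng = random.Random(0)               # Seeded for repeatable runs
samples = [top_k_sample(logits, k=2, rng=rng) for _ in range(20)]

print(set(samples) <= {0, 2})        # True: only the two best ids are ever chosen
```

Setting k=1 reduces this to greedy decoding, while larger k gives more varied but less predictable text.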
Conclusion
This lesson covers NLP issues, architectures, and future trends, along with hands-on
implementations of Word Embeddings, Transformers, and GPT-based text generation.