Week 11-14 With Week 1-14

This document is about Python. Python students may find it useful.
Copyright
© All Rights Reserved

BIT4133 NLP

All notes in a single document

Week: 1-4 (Summarized for revision)

Week 6- 14 Detailed for practice.

Week 1: Introduction to NLP and Applications
Objective:
 Understand the fundamentals of NLP and its real-world applications.
 Explore core NLP tasks like speech recognition, machine translation, sentiment
analysis, and chatbots.
 Conduct research on real-world use cases of NLP.

Key Terms & Definitions


1. Natural Language Processing (NLP) – The field of AI that enables computers to
understand, interpret, and generate human language.
2. Tokenization – Splitting text into words (word tokenization) or sentences (sentence
tokenization).
3. Lemmatization – Converting words to their base or dictionary form (e.g., "running" →
"run").
4. Named Entity Recognition (NER) – Identifying names of people, locations, dates, etc.
in text.
5. Sentiment Analysis – Determining whether text expresses positive, negative, or
neutral sentiment.
6. Machine Translation (MT) – Automatically translating text from one language to
another.
7. Chatbots – AI-driven conversational agents that interact via text or voice.
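
As a rough illustration of the tokenization term above, the sketch below splits text into sentences and words using only regular expressions from the standard library. The function names `simple_sentence_tokenize` and `simple_word_tokenize` are made up for this example, and the rules are deliberately naive (abbreviations such as "e.g." would be split incorrectly); libraries such as spaCy and NLTK handle many more cases.

```python
import re

def simple_sentence_tokenize(text):
    # Split after ., ! or ? when followed by whitespace (a naive rule).
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def simple_word_tokenize(sentence):
    # A token is a run of letters/digits/apostrophes, or one punctuation mark.
    return re.findall(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']", sentence)

text = "NLP is fun. It powers chatbots!"
sentences = simple_sentence_tokenize(text)
print(sentences)
print([simple_word_tokenize(s) for s in sentences])
```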

Class Task 1: Implement a Simple NLP Pipeline


Step-by-Step Code Explanation

1. Install Required Libraries


pip install nltk spacy
python -m spacy download en_core_web_sm

2. Import Required Libraries


import spacy

# Load NLP model


nlp = spacy.load("en_core_web_sm")

# Sample text
text = "OpenAI developed ChatGPT in 2023. It is widely used in the tech industry."

# Process text
doc = nlp(text)

# Tokenization and POS tagging
for token in doc:
    print(f"Token: {token.text}, POS: {token.pos_}")

# Named Entity Recognition (NER)
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")

Expected Output

 Tokens and POS tags


 Extracted entities (e.g., OpenAI, ChatGPT, 2023)

Assignment: Research Real-World NLP Applications


 Research three real-world NLP applications (e.g., Google Translate, Alexa,
ChatGPT).
 Write a one-page report on their use cases, challenges, and impact.

Week 2: N-gram Language Models and Part-of-Speech Tagging
Objective:
 Learn about N-gram language models for predicting the next word in a sequence.
 Implement an N-gram model for text generation.
 Perform Part-of-Speech (POS) tagging using NLTK.
Key Terms & Definitions
1. N-gram Model – A probabilistic model that predicts the next word based on the
previous N-1 words.
2. Unigram, Bigram, Trigram – N-gram models where N = 1, 2, 3 respectively.
3. POS Tagging – Assigning grammatical tags (noun, verb, etc.) to words.
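
The definition of an N-gram model can be made concrete with a minimal bigram sketch that estimates P(next | previous) from counts and then samples words to generate text. It uses only the standard library; the toy corpus and the helper names `next_word_probs` and `generate` are assumptions made for illustration, not part of NLTK.

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each previous word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Maximum-likelihood estimate: count(prev, next) / count(prev, any)."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def generate(start, length, seed=0):
    """Sample up to `length` words, each conditioned on the previous word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        probs = next_word_probs(words[-1])
        if not probs:  # no observed continuation for this word
            break
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

print(next_word_probs("the"))  # cat: 0.5, mat: 0.25, fish: 0.25
print(generate("the", 6))
```

With a real corpus, unseen bigrams get probability zero under this estimate, which is why practical models add smoothing.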

Class Task: Implement an N-gram Model in Python


1. Install NLTK
pip install nltk

2. Implement a Bigram Model


import nltk
from nltk import bigrams
from nltk.probability import FreqDist

nltk.download('reuters')
from nltk.corpus import reuters

# Load a sample text


words = list(reuters.words(categories='crude'))

# Generate bigrams
bi_grams = list(bigrams(words))

# Compute frequency distribution


fdist = FreqDist(bi_grams)

# Print most common bigrams


print(fdist.most_common(10))

Expected Output

 List of the most frequent bigrams (word pairs) in the Reuters dataset.

Assignment: POS Tagging Using NLTK


 Use NLTK’s POS tagger to tag words in a sample sentence.
 Analyze the output and describe the most common POS tags.
Week 3: Hidden Markov Models (HMM) and Sequence Labeling
Objective:
 Understand Hidden Markov Models (HMMs) for sequence-based classification.
 Implement a POS tagger using HMMs.

Key Terms & Definitions


1. Hidden Markov Model (HMM) – A probabilistic model used for sequence prediction
tasks (e.g., POS tagging, speech recognition).
2. Transition Probabilities – The probability of moving from one hidden state (POS tag)
to another.
3. Emission Probabilities – The probability of a word appearing given a POS tag.
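
To show how transition and emission probabilities combine, here is a minimal Viterbi decoding sketch in plain Python. All probabilities and the three-tag set are invented for illustration (a trained tagger would estimate them from a corpus such as the Penn Treebank sample), and the function name `viterbi` is just a label for this example.

```python
states = ["DET", "NOUN", "VERB"]

# Hypothetical probabilities, chosen by hand for this example.
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET":  {"DET": 0.05, "NOUN": 0.90, "VERB": 0.05},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.50, "NOUN": 0.30, "VERB": 0.20},
}
emit_p = {
    "DET":  {"the": 0.90, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.75, "barks": 0.20},
    "VERB": {"the": 0.05, "dog": 0.10, "barks": 0.85},
}

def viterbi(words):
    # best[t][s]: probability of the best tag path that ends in state s at position t
    best = [{s: start_p[s] * emit_p[s][words[0]] for s in states}]
    back = [{}]
    for t in range(1, len(words)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score * emit_p[s][words[t]]
            back[t][s] = prev
    # Follow the back-pointers from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

print(viterbi(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```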

Class Task: Implement a Simple HMM POS Tagger


1. Install Required Libraries
pip install nltk

2. Train an HMM for POS Tagging


import nltk
from nltk.tag import hmm

nltk.download('treebank')
nltk.download('punkt')  # Tokenizer data required by nltk.word_tokenize

# Load dataset
tagged_sentences = nltk.corpus.treebank.tagged_sents()

# Train HMM POS tagger


trainer = hmm.HiddenMarkovModelTrainer()
hmm_tagger = trainer.train(tagged_sentences[:3000])

# Test on a new sentence


test_sentence = nltk.word_tokenize("The stock market crashed yesterday.")
tagged_sentence = hmm_tagger.tag(test_sentence)

print(tagged_sentence)

Expected Output

 Tagged words with their respective POS labels.


Assignment: Implement Named Entity Recognition Using HMMs
 Modify the HMM model to detect named entities instead of POS tags.

Week 4: Syntactic and Semantic Analysis


Objective:
 Perform syntax analysis using dependency parsing.
 Understand semantic role labeling in NLP.

Key Terms & Definitions


1. Syntax Analysis – Analyzing the grammatical structure of a sentence.
2. Dependency Parsing – Understanding relationships between words (subject, object,
verb).
3. Semantic Role Labeling (SRL) – Identifying the role of words in a sentence (who did
what to whom).
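
The "who did what to whom" idea can be approximated from dependency labels alone. The sketch below works on a hand-written parse in (word, dependency, head) form, mimicking the fields spaCy exposes as `token.text`, `token.dep_` and `token.head.text`; the parse itself and the helper `extract_svo` are assumptions for illustration, and real semantic role labeling covers far more than subject and object.

```python
# Hand-written dependency parse of "The cat chased the mouse."
parse = [
    ("The", "det", "cat"),
    ("cat", "nsubj", "chased"),
    ("chased", "ROOT", "chased"),
    ("the", "det", "mouse"),
    ("mouse", "dobj", "chased"),
    (".", "punct", "chased"),
]

def extract_svo(parse):
    """Return (subject, verb, object) attached to the root verb, or None where missing."""
    root = next(text for text, dep, head in parse if dep == "ROOT")
    subject = next((t for t, d, h in parse if d == "nsubj" and h == root), None)
    obj = next((t for t, d, h in parse if d == "dobj" and h == root), None)
    return subject, root, obj

print(extract_svo(parse))  # ('cat', 'chased', 'mouse')
```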

Class Task: Analyze Syntax of a Sentence Using spaCy


1. Install spaCy
pip install spacy
python -m spacy download en_core_web_sm

2. Implement Dependency Parsing


import spacy

nlp = spacy.load("en_core_web_sm")

sentence = "The cat sat on the mat."

doc = nlp(sentence)

for token in doc:
    print(f"Word: {token.text}, Dependency: {token.dep_}, Head: {token.head.text}")

Expected Output
 Dependency relationships between words (e.g., subject, verb, object).

Assignment: Perform Semantic Role Labeling (SRL)


 Use a pre-trained SRL model (e.g., AllenNLP) to label roles in a sentence.

Week 5- CAT 1

Week 6: Information Extraction (IE) and Named Entity Recognition (NER)
Objective:

 Understand Information Extraction (IE) and its components (Named Entity Recognition, Relation Extraction, and Event Extraction).
 Learn about Named Entity Recognition (NER) and its applications in NLP.
 Develop a NER system using Python and spaCy/NLTK.
 Extract entities, relationships, and facts from unstructured text data.

Key Terms & Definitions


1. Information Extraction (IE) – The process of automatically extracting structured data
(entities, relations, facts) from unstructured text.
2. Named Entity Recognition (NER) – Identifying and classifying named entities
(persons, locations, organizations, dates, etc.) in text.
3. Relation Extraction (RE) – Identifying relationships between extracted entities (e.g.,
"Barack Obama" is the "President of the USA").
4. Event Extraction – Identifying important events from text (e.g., “COVID-19 pandemic
started in 2019”).
5. Tokenization – Splitting text into words or sentences for analysis.
6. Part-of-Speech (POS) Tagging – Identifying the grammatical role of words (noun,
verb, adjective, etc.).
7. Dependency Parsing – Analyzing the grammatical structure of a sentence.
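
Relation extraction can be illustrated without any model by matching a surface pattern. The sketch below is a pattern-based approach (not the dependency-graph approach used in Class Task 2) that pulls a "founded" relation out of a sentence with a regular expression. The pattern, the group names and the output tuple layout are assumptions for this example and only cover sentences of the form "X founded Y in YEAR".

```python
import re

# Capitalised name(s), the verb "founded", a capitalised organisation, and a 4-digit year.
PATTERN = re.compile(
    r"(?P<founder>[A-Z][a-z]+(?: [A-Z][a-z]+)*) founded (?P<org>[A-Z]\w+) in (?P<year>\d{4})"
)

text = "Elon Musk founded SpaceX in 2002. Tesla is headquartered in Palo Alto, California."

relations = [
    (m["founder"], "founded", m["org"], m["year"])
    for m in PATTERN.finditer(text)
]
print(relations)  # [('Elon Musk', 'founded', 'SpaceX', '2002')]
```

Hand-written patterns are precise but brittle, which is why the tasks in this lesson lean on NER and dependency parsing instead.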

Class Task 1: Implement Named Entity Recognition (NER) Using spaCy
Step-by-Step Code Explanation

1. Install Required Libraries


pip install spacy
python -m spacy download en_core_web_sm

2. Import Required Libraries


import spacy

# Load the English NLP model


nlp = spacy.load("en_core_web_sm")

3. Apply Named Entity Recognition (NER) on Sample Text


# Sample text
text = "Elon Musk founded SpaceX in 2002. Tesla is headquartered in Palo Alto,
California."

# Process text
doc = nlp(text)

# Extract and print named entities


for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")

Expected Output

 Extracted entities and their types:


 Entity: Elon Musk, Label: PERSON
 Entity: SpaceX, Label: ORG
 Entity: 2002, Label: DATE
 Entity: Tesla, Label: ORG
 Entity: Palo Alto, Label: GPE
 Entity: California, Label: GPE

Class Task 2: Implement a Custom Information Extraction System
1. Extract Entities and Their Relations from Text
# Inspect each token's POS tag and dependency relation (reuses `doc` from Class Task 1)
for token in doc:
    print(f"Word: {token.text}, POS: {token.pos_}, Dependency: {token.dep_}, Head: {token.head.text}")

2. Identify Relationships Between Entities


import networkx as nx
import matplotlib.pyplot as plt

# Create a graph for relationships


graph = nx.DiGraph()

for token in doc:
    if token.dep_ in ("nsubj", "dobj", "pobj"):  # Identify subjects and objects
        graph.add_edge(token.head.text, token.text)

# Visualize the entity relationships


plt.figure(figsize=(6, 4))
nx.draw(graph, with_labels=True, node_color='lightblue', edge_color='gray')
plt.show()

Expected Output

 A graph visualization of relationships between words.

Assignments
Assignment 1: Train a Custom NER Model on a Domain-Specific Dataset

Task:

 Use spaCy to fine-tune an NER model for extracting entities from medical or legal
documents.

Sample Code Structure:


import spacy
from spacy.training import Example

nlp = spacy.blank("en")  # Create a blank English NLP model
nlp.add_pipe("ner")  # Add a Named Entity Recognizer

# Define custom training data (entity spans are character offsets: start, end, label)
TRAIN_DATA = [
    ("Apple is based in Cupertino.", {"entities": [(0, 5, "ORG"), (18, 27, "GPE")]}),
    ("Microsoft was founded by Bill Gates.", {"entities": [(0, 9, "ORG"), (25, 35, "PERSON")]}),
]
examples = [Example.from_dict(nlp.make_doc(text), annotations) for text, annotations in TRAIN_DATA]

# Train the model; initialize() infers the entity labels from the examples
optimizer = nlp.initialize(lambda: examples)
for itn in range(10):
    losses = {}
    for example in examples:
        nlp.update([example], sgd=optimizer, losses=losses)
    print(f"Losses at iteration {itn}: {losses}")

Expected Output

 A custom-trained NER model capable of recognizing specific entities.

Assignment 2: Extract Information from News Articles or Wikipedia Pages

Task:

 Use BeautifulSoup or Newspaper3k to extract text from a news article and apply NER.

Sample Code Structure:


import spacy
from newspaper import Article

nlp = spacy.load("en_core_web_sm")

# Extract text from a news article
url = "https://www.bbc.com/news/world-us-canada-66080367"
article = Article(url)
article.download()
article.parse()

# Process article text
doc = nlp(article.text)

# Print named entities
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")

Expected Output

 Named entities extracted from real-world news data.

Conclusion

This lesson covers Information Extraction (IE) and Named Entity Recognition (NER) using
Python and spaCy, including entity recognition, relationship extraction, and custom model
training.

Week 7: Machine Translation and Statistical Models


Objective:

 Understand machine translation (MT) and its types (Rule-based, Statistical, and
Neural MT).
 Learn about Statistical Machine Translation (SMT) and Neural Machine Translation
(NMT).
 Implement and experiment with translation models in Python.
 Use seq2seq models with attention for improved translation performance.

Key Terms & Definitions


1. Machine Translation (MT) – The use of software to automatically translate text
between languages.
2. Rule-Based Machine Translation (RBMT) – Uses linguistic rules and dictionaries
for translation.
3. Statistical Machine Translation (SMT) – Uses probability-based models to generate
translations.
4. Neural Machine Translation (NMT) – Uses deep learning models (RNNs, LSTMs,
Transformers) for translation.
5. Seq2Seq Model – A deep learning model that processes sequences (input → encoder →
decoder → output).
6. Attention Mechanism – A technique that helps NMT models focus on relevant words
during translation.
7. BLEU Score – A metric for evaluating translation quality by comparing machine-
generated translations to human translations.
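
To make the statistical approach concrete, the sketch below runs the expectation-maximization loop behind IBM Model 1 on a three-sentence toy corpus in plain Python. The German-English sentence pairs, the variable names and the omission of the NULL source token are simplifying assumptions for illustration; NLTK's `IBMModel1` is the fuller implementation used in the class task.

```python
from collections import defaultdict

# (source sentence, target sentence) pairs: a tiny toy corpus.
corpus = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]

target_vocab = {w for _, tgt in corpus for w in tgt}

# t[(tgt, src)] approximates P(tgt | src); start uniform over co-occurring pairs.
t = {(w_t, w_s): 1.0 / len(target_vocab)
     for src, tgt in corpus for w_t in tgt for w_s in src}

for _ in range(20):
    count = defaultdict(float)  # expected number of times src aligns to tgt
    total = defaultdict(float)  # expected number of alignments involving src
    # E-step: share each target word among the source words of its sentence.
    for src, tgt in corpus:
        for w_t in tgt:
            norm = sum(t[(w_t, w_s)] for w_s in src)
            for w_s in src:
                frac = t[(w_t, w_s)] / norm
                count[(w_t, w_s)] += frac
                total[w_s] += frac
    # M-step: re-estimate the translation probabilities from expected counts.
    for (w_t, w_s), c in count.items():
        t[(w_t, w_s)] = c / total[w_s]

def best_translation(source_word):
    candidates = {w_t: p for (w_t, w_s), p in t.items() if w_s == source_word}
    return max(candidates, key=candidates.get)

print([best_translation(w) for w in ["das", "Haus", "Buch", "ein"]])
```

Even though no word-level alignments are given, the co-occurrence of "das" with "the" in two sentences is enough for the model to separate it from "Haus" and "Buch".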

Class Task 1: Implement a Simple Statistical Machine Translation Model
Step-by-Step Code Explanation

1. Install Required Libraries


pip install nltk

2. Import Required Libraries


import nltk
from nltk.translate.ibm1 import IBMModel1
from nltk.translate.api import AlignedSent
nltk.download('comtrans')

3. Load Parallel Sentences for Training


from nltk.corpus import comtrans

# Load English-French sentence pairs


english_sents = comtrans.aligned_sents('alignment-en-fr.txt')[:1000]

4. Train an IBM Model 1 for Word Alignment


aligned_sentences = [AlignedSent(pair.words, pair.mots) for pair in
english_sents]

# Train IBM Model 1


ibm1 = IBMModel1(aligned_sentences, 10) # Train for 10 iterations

5. Translate a Simple Sentence


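test_sentence = ["the", "house", "is", "blue"]
table = ibm1.translation_table  # table[target][source] = t(target | source); English is the target side here
translation = []
for word in test_sentence:
    candidates = {src: p for src, p in table.get(word, {}).items() if src is not None}
    translation.append(max(candidates, key=candidates.get) if candidates else word)
print("Translated Sentence:", " ".join(translation))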

Expected Output

 The model provides a simple English-to-French word translation.

Class Task 2: Implement a Simple Neural Machine Translation Model (Seq2Seq)
1. Install TensorFlow
pip install tensorflow

2. Import Necessary Libraries


import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

3. Define Training Data


# Sample English-French sentence pairs
english_sentences = ["hello", "how are you", "good morning", "thank you"]
french_sentences = ["bonjour", "comment ça va", "bon matin", "merci"]
# Tokenize English and French sentences
tokenizer_en = Tokenizer()
tokenizer_fr = Tokenizer()

tokenizer_en.fit_on_texts(english_sentences)
tokenizer_fr.fit_on_texts(french_sentences)

sequences_en = tokenizer_en.texts_to_sequences(english_sentences)
sequences_fr = tokenizer_fr.texts_to_sequences(french_sentences)

# Pad sequences for uniform length


sequences_en = pad_sequences(sequences_en, padding='post')
sequences_fr = pad_sequences(sequences_fr, padding='post')

# Convert to NumPy arrays


X = np.array(sequences_en)
y = np.array(sequences_fr)

4. Build a Simple Seq2Seq Translation Model


# Define model
# Note: this is a simplified stand-in for Seq2Seq. It predicts one French token
# per input position, so X and y must be padded to the same length (both are 3
# here). A full Seq2Seq model uses a separate encoder and decoder.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(tokenizer_en.word_index) + 1, output_dim=8),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.Dense(len(tokenizer_fr.word_index) + 1, activation='softmax')
])

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

5. Train the Model

model.fit(X, y, epochs=100, batch_size=8)

6. Translate a Sentence
def translate_sentence(sentence):
    seq = tokenizer_en.texts_to_sequences([sentence])
    seq = pad_sequences(seq, maxlen=X.shape[1], padding='post')
    prediction = model.predict(seq)  # Shape: (1, timesteps, French vocabulary size)
    translated_seq = np.argmax(prediction, axis=-1)[0]
    words = [tokenizer_fr.index_word[i] for i in translated_seq if i != 0]
    return " ".join(words)

print("English: hello")
print("French:", translate_sentence("hello"))

Expected Output
 The model translates simple English phrases into French.

Assignments
Assignment 1: Train a Transformer-Based Translation Model

Task:

 Implement a Transformer-based translation model using the Hugging Face Transformers library (the sample below loads a pre-trained MarianMT model and uses PyTorch tensors).

Sample Code Structure:


from transformers import MarianMTModel, MarianTokenizer

# Load a pre-trained English-French translation model


model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def translate_text(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
    translated = model.generate(**inputs)
    return tokenizer.decode(translated[0], skip_special_tokens=True)

print(translate_text("How are you?"))

Expected Output

 The model translates longer and more complex sentences.

Assignment 2: Evaluate Machine Translation Performance with BLEU Score

Task:

 Implement BLEU Score evaluation to compare machine-generated translations with reference translations.

Sample Code Structure:


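from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Reference and candidate translations
reference = [["bonjour", "comment", "ça", "va"]]
candidate = ["bonjour", "ça", "va"]

# Compute BLEU score; smoothing avoids a near-zero score when a short
# candidate has no matching 3-grams or 4-grams
score = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print("BLEU Score:", score)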

Expected Output

 A BLEU score indicating translation accuracy.
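
To see what the score measures, BLEU can be worked out by hand for the same sentence pair. The sketch below is a simplified single-reference version that uses only unigrams and bigrams (NLTK's `sentence_bleu` defaults to n-grams up to length 4, so its value will differ); the helper names `ngrams`, `modified_precision` and `bleu` are assumptions for this example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(reference, candidate, n):
    """Fraction of candidate n-grams found in the reference, with clipped counts."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(cand.values()), 1)

def bleu(reference, candidate, max_n=2):
    precisions = [modified_precision(reference, candidate, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = ["bonjour", "comment", "ça", "va"]
candidate = ["bonjour", "ça", "va"]
print(modified_precision(reference, candidate, 1))  # 1.0 (all three words appear)
print(modified_precision(reference, candidate, 2))  # 0.5 (only "ça va" matches)
print(round(bleu(reference, candidate), 4))         # 0.5067
```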

Conclusion

This lesson covers machine translation using statistical and neural models, including IBM
Model 1, Seq2Seq models, and Transformers.

Week 8: Introduction to Deep Learning for NLP


Objective:

 Understand the basics of Deep Learning in NLP.


 Learn about word embeddings and why they are important.
 Train a Word2Vec model to learn vector representations of words.
 Explore applications of word embeddings in NLP tasks like text classification and
sentiment analysis.

Key Terms & Definitions


1. Deep Learning for NLP – The use of deep neural networks to process and understand
natural language.
2. Word Embeddings – Represent words as dense numerical vectors in a multi-
dimensional space.
3. Word2Vec – A shallow neural network model that learns word embeddings using CBOW
(Continuous Bag of Words) or Skip-gram architectures.
4. CBOW (Continuous Bag of Words) – Predicts a target word based on surrounding
context words.
5. Skip-gram Model – Predicts surrounding words given a target word.
6. GloVe (Global Vectors for Word Representation) – A different approach to word
embeddings based on matrix factorization of word co-occurrence statistics.
7. FastText – An extension of Word2Vec that also learns from subword (character n-gram) information.
8. Embedding Layer in TensorFlow – A layer that converts word indices into dense
vectors.
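
The difference between CBOW and Skip-gram comes down to how training examples are built from a context window. The sketch below generates both kinds of examples for a short sentence in plain Python; the function names and the window size of 1 are assumptions for illustration, and gensim builds such examples internally.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: one (target, context_word) pair per neighbour of the target."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """CBOW: one (context_words, target) example per position."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, target))
    return examples

tokens = "deep learning for nlp".split()
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))
```

Skip-gram predicts each neighbour from the centre word, while CBOW predicts the centre word from all of its neighbours at once.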

Class Task 1: Implement a Word2Vec Model in Python


Step-by-Step Code Explanation

1. Install Required Libraries


pip install gensim nltk

2. Import Necessary Libraries


import nltk
from nltk.tokenize import word_tokenize
from gensim.models import Word2Vec
nltk.download('punkt')

3. Prepare Text Data


text = "Deep learning for NLP enables computers to understand and process human language efficiently."

# Tokenization
tokens = word_tokenize(text.lower()) # Convert to lowercase and tokenize
print("Tokens:", tokens)

4. Train the Word2Vec Model


sentences = [tokens] # Word2Vec requires a list of tokenized sentences

# Train a Word2Vec model using the Skip-gram method


model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

# Save the model


model.save("word2vec_model.bin")

5. Test the Trained Word Embeddings


# Load the trained model
model = Word2Vec.load("word2vec_model.bin")

# Get vector for a word


vector = model.wv['language']
print("Word Vector for 'language':", vector[:10]) # Show first 10 values

# Find similar words


print("Most similar words to 'language':", model.wv.most_similar('language'))
Expected Output

 The model generates word vectors.


 It finds similar words based on learned embeddings.
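
Similar-word lookups of this kind rank words by the cosine similarity of their vectors. A minimal sketch of that ranking, using three made-up 3-dimensional vectors (real embeddings have many more dimensions and are learned from data), could look like this; `cosine_similarity` and `most_similar` are illustrative helpers, not gensim functions.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, invented for this example.
vectors = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.1, 0.0, 0.9],
}

def most_similar(word, topn=2):
    """Rank every other word by cosine similarity to `word`."""
    scores = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]

print(most_similar("king"))
```

With a single training sentence, as in this class task, the learned vectors are close to random, so meaningful neighbours only appear after training on a much larger corpus.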

Class Task 2: Use Word2Vec in a Text Classification Model


1. Install TensorFlow
pip install tensorflow

2. Build a Text Classification Model with an Embedding Layer


import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Sample text data


sentences = ["Deep learning is powerful",
"NLP helps computers understand text",
"Word embeddings improve accuracy",
"Neural networks process data"]

# Tokenize the text


tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(sentences)

# Pad sequences to ensure uniform length


padded_sequences = pad_sequences(sequences, padding='post')

# Define a simple neural network


model = tf.keras.Sequential([
    tf.keras.Input(shape=(padded_sequences.shape[1],)),
    tf.keras.layers.Embedding(input_dim=len(word_index) + 1, output_dim=8),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
model.summary()

Expected Output

 A simple neural network that uses word embeddings for text classification.
Assignments
Assignment 1: Train a Word2Vec Model on a Larger Dataset

Task:

 Use a larger text dataset (e.g., a book or Wikipedia articles) to train a Word2Vec
model.
 Compare results between CBOW and Skip-gram models.

Sample Code Structure:


from gensim.models import Word2Vec
from nltk.tokenize import sent_tokenize, word_tokenize

# Load a larger text dataset


text_data = open("text_corpus.txt", encoding="utf-8").read()
sentences = [word_tokenize(sent.lower()) for sent in sent_tokenize(text_data)]

# Train Word2Vec using CBOW


cbow_model = Word2Vec(sentences, vector_size=100, window=5, min_count=2, sg=0)

# Train Word2Vec using Skip-gram


skipgram_model = Word2Vec(sentences, vector_size=100, window=5, min_count=2,
sg=1)

Expected Output

 Word2Vec models trained on larger datasets, showing differences between CBOW and
Skip-gram methods.

Assignment 2: Compare Word Embeddings from Word2Vec, GloVe, and FastText

Task:

 Compare Word2Vec, GloVe, and FastText embeddings for a set of words.


 Analyze the quality of word similarities.

Sample Code Structure:


from gensim.models import FastText
from gensim.models import KeyedVectors

# Train a FastText model


fasttext_model = FastText(sentences, vector_size=100, window=5, min_count=2)

# Load pre-trained GloVe embeddings


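# GloVe text files have no header line, so no_header=True is needed (gensim 4.0 or later)
glove_model = KeyedVectors.load_word2vec_format("glove.6B.100d.txt", binary=False, no_header=True)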

# Compare word vectors


word = "language"
print("Word2Vec Similar Words:", model.wv.most_similar(word))
print("FastText Similar Words:", fasttext_model.wv.most_similar(word))
print("GloVe Similar Words:", glove_model.most_similar(word))

Expected Output

 Comparison of similar words for Word2Vec, FastText, and GloVe.

Conclusion

This lesson covers deep learning for NLP, including Word2Vec, embeddings, and their
applications.
Week 9: Recurrent Neural Networks (RNNs) and Language Models
Objective:

 Understand the role of RNNs in language modeling.


 Learn how RNNs handle sequential data and their challenges.
 Implement a simple RNN for language modeling using Python.
 Explore text generation using an RNN model.

Key Terms & Definitions


1. Recurrent Neural Network (RNN) – A type of neural network designed for sequential
data processing where previous outputs influence future computations.
2. Language Model – A model that predicts the next word in a sequence based on
previous words.
3. Vanishing Gradient Problem – A common issue in RNNs where long-range
dependencies become difficult to learn.
4. Long Short-Term Memory (LSTM) – A type of RNN that mitigates vanishing
gradient issues using memory cells.
5. Gated Recurrent Unit (GRU) – A simplified version of LSTMs, designed to be
computationally efficient.
6. Embedding Layer – Converts words into dense vectors for better processing by RNNs.
7. Sequence-to-Sequence (Seq2Seq) Model – A model architecture used in machine
translation, chatbots, and summarization.
8. Temperature (in text generation) – A parameter that controls randomness in RNN-
generated text.
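
The effect of temperature can be sketched in a few lines of plain Python: the scores are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. The logits below are made up, and the helper names are assumptions; to apply this to a Keras model that already outputs softmax probabilities, one would first take their logarithm.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(logits, temperature=1.0, rng=random):
    """Draw one index at random according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three characters
for temp in (0.5, 1.0, 2.0):
    print(temp, [round(p, 3) for p in softmax_with_temperature(logits, temp)])
print(sample_index(logits, temperature=0.5, rng=random.Random(0)))
```

Greedy decoding with argmax always picks the highest-scoring character; sampling with a temperature trades some of that certainty for variety.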

Class Task 1: Implement a Simple RNN for Text Generation


Step-by-Step Code Explanation

1. Install Dependencies
pip install tensorflow numpy

2. Import Required Libraries


import tensorflow as tf
import numpy as np
import string

3. Prepare Training Data


text = "hello world welcome to recurrent neural networks"
chars = sorted(set(text)) # Unique characters
char_to_index = {c: i for i, c in enumerate(chars)}
index_to_char = {i: c for i, c in enumerate(chars)}

# Convert text into numerical sequence


encoded_text = [char_to_index[c] for c in text]
print("Encoded Text:", encoded_text)

4. Create Training Sequences


seq_length = 5
X, y = [], []

for i in range(len(encoded_text) - seq_length):
    X.append(encoded_text[i:i+seq_length])
    y.append(encoded_text[i+seq_length])

X = np.array(X)
y = np.array(y)

5. Build the RNN Model


model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_length,)),
    tf.keras.layers.Embedding(input_dim=len(chars), output_dim=8),
    tf.keras.layers.SimpleRNN(64, return_sequences=True),
    tf.keras.layers.SimpleRNN(64),
    tf.keras.layers.Dense(len(chars), activation='softmax')
])

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
model.summary()

6. Train the Model


model.fit(X, y, epochs=100, batch_size=8)

Expected Output

 The model learns the character sequences in the text.


Class Task 2: Generate Text Using the Trained RNN
1. Function to Predict Next Character
def generate_text(seed_text, length):
    for _ in range(length):
        # Encode the last seq_length characters of the running text
        encoded_seed = [char_to_index[c] for c in seed_text]
        encoded_seed = np.array(encoded_seed[-seq_length:]).reshape(1, -1)

        # Pick the most probable next character (greedy decoding)
        prediction = model.predict(encoded_seed)
        next_char = index_to_char[np.argmax(prediction)]

        seed_text += next_char
    return seed_text

2. Generate New Text


print(generate_text("hello", 20))

Expected Output

 The model generates a sequence of characters based on the training text.

Assignments
Assignment 1: Implement an LSTM-Based Text Generator

Task:

 Modify the RNN to use LSTMs instead of SimpleRNN.

Sample Code Structure:


model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_length,)),
    tf.keras.layers.Embedding(input_dim=len(chars), output_dim=8),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(len(chars), activation='softmax')
])

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
model.fit(X, y, epochs=100, batch_size=8)

Expected Output
 A text generator using LSTMs instead of RNNs.

Assignment 2: Train an RNN on a Larger Text Dataset

Task:

 Train an RNN on a larger text dataset (e.g., Shakespeare’s text).

Sample Code Structure:


import tensorflow_datasets as tfds

# Load a text dataset


dataset = tfds.load("tiny_shakespeare", split="train")

# Each example is a dict with a single "text" feature (there are no supervised label pairs)
text_data = ""
for example in dataset:
    text_data += example["text"].numpy().decode('utf-8')

print("Sample Text:", text_data[:500])

Expected Output

 The model learns larger vocabulary and more complex text patterns.

Conclusion

This lesson covers RNNs and language modeling, including text generation using
SimpleRNN and LSTMs.
Week 10: CAT 2.

Week 11: Advanced Neural Networks and Attention Models


Objective:

 Understand advanced neural network architectures such as transformers.


 Learn how attention mechanisms work in deep learning.
 Implement an attention mechanism using TensorFlow and Keras.
 Apply attention to tasks such as natural language processing (NLP) and sequence
modeling.

Key Terms & Definitions


1. Attention Mechanism – A method that allows a model to focus on specific parts of an
input sequence when making predictions.
2. Self-Attention – A mechanism where each token in a sequence attends to all other
tokens.
3. Multi-Head Attention – A technique where multiple attention layers process input in
parallel to learn different representations.
4. Transformers – A deep learning model architecture that uses attention mechanisms to
process sequential data efficiently.
5. Query, Key, and Value (QKV) – The fundamental components of attention mechanisms
used to compute attention scores.
6. Softmax Function – A function that converts attention scores into probabilities.
7. TensorFlow/Keras – A deep learning framework used to build and train attention-based
models.
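
The way queries, keys, values and the softmax fit together can be shown with a minimal scaled dot-product attention sketch in plain Python, without TensorFlow. The tiny matrices are made up for illustration, and the helper names are assumptions; a real layer adds learned projections and batching.

```python
import math

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    d_k = len(K[0])
    # scores[i][j] = dot(Q[i], K[j]) / sqrt(d_k)
    scores = [
        [sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in K]
        for q_row in Q
    ]
    # Softmax turns each row of scores into weights that sum to 1.
    weights = [softmax(row) for row in scores]
    # Each output row is the weighted average of the value vectors.
    output = [
        [sum(w * v_row[c] for w, v_row in zip(w_row, V)) for c in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights

Q = [[1.0, 0.0]]                  # one query
K = [[1.0, 0.0], [0.0, 1.0]]      # two keys
V = [[10.0, 0.0], [0.0, 10.0]]    # two values

output, weights = scaled_dot_product_attention(Q, K, V)
print("Weights:", [round(w, 3) for w in weights[0]])
print("Output:", [round(x, 3) for x in output[0]])
```

The query matches the first key more closely, so the first value receives the larger weight in the output.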

Class Task 1: Implement a Basic Attention Mechanism in Python
Step-by-Step Code Explanation

1. Install Dependencies

Ensure you have TensorFlow installed:

pip install tensorflow numpy


2. Import Libraries
import tensorflow as tf
import numpy as np

3. Define the Attention Mechanism


class SimpleAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super(SimpleAttention, self).__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # Add a time axis so the query broadcasts against every position in values
        query_with_time_axis = tf.expand_dims(query, 1)
        score = self.V(tf.nn.tanh(self.W1(query_with_time_axis) + self.W2(values)))
        # Normalize the scores over the sequence axis
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * values
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights

4. Generate Sample Data and Apply Attention


query = tf.constant([[0.5, 1.0, 0.3]], dtype=tf.float32)  # Query vector, shape (batch=1, 3)
values = tf.random.normal([1, 10, 3])  # Random values simulating a sequence of 10 embeddings, shape (batch=1, 10, 3)

attention_layer = SimpleAttention(10)
context_vector, attention_weights = attention_layer(query, values)

print("Context Vector:", context_vector.numpy())
print("Attention Weights:", attention_weights.numpy())

Expected Output

 A context vector representing weighted information from the input sequence.


 Attention weights showing how much importance is assigned to each input element.

Class Task 2: Implement Multi-Head Attention Using TensorFlow
1. Define Multi-Head Attention Layer
class MultiHeadAttention(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads):
        super(MultiHeadAttention, self).__init__()
        self.num_heads = num_heads
        self.d_model = d_model

        assert d_model % num_heads == 0  # d_model must be divisible by num_heads

        self.depth = d_model // num_heads

        self.Wq = tf.keras.layers.Dense(d_model)
        self.Wk = tf.keras.layers.Dense(d_model)
        self.Wv = tf.keras.layers.Dense(d_model)

        self.dense = tf.keras.layers.Dense(d_model)

    def split_heads(self, x, batch_size):
        # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, depth)
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, query, key, value):
        batch_size = tf.shape(query)[0]

        Q = self.split_heads(self.Wq(query), batch_size)
        K = self.split_heads(self.Wk(key), batch_size)
        V = self.split_heads(self.Wv(value), batch_size)

        # Scaled dot-product attention, computed independently per head
        scores = tf.matmul(Q, K, transpose_b=True) / tf.math.sqrt(tf.cast(self.depth, tf.float32))
        attention_weights = tf.nn.softmax(scores, axis=-1)

        # Merge the heads back into a single (batch, seq_len, d_model) tensor
        output = tf.matmul(attention_weights, V)
        output = tf.transpose(output, perm=[0, 2, 1, 3])
        output = tf.reshape(output, (batch_size, -1, self.d_model))

        return self.dense(output)

2. Apply Multi-Head Attention


query = tf.random.normal([2, 5, 64]) # Batch of 2, 5 tokens, 64 features
key = tf.random.normal([2, 5, 64])
value = tf.random.normal([2, 5, 64])

mha = MultiHeadAttention(d_model=64, num_heads=8)


output = mha(query, key, value)
print("Multi-Head Attention Output Shape:", output.shape)

Expected Output

 A transformed output where attention is applied across multiple heads.

Assignments
Assignment 1: Implement Attention in an RNN-Based Sequence Model
Task:

 Build an LSTM-based text generation model using an attention layer.

Sample Code Structure:


class AttentionLSTM(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.lstm = tf.keras.layers.LSTM(rnn_units, return_sequences=True, return_state=True)
        self.attention = SimpleAttention(rnn_units)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, states):
        x = self.embedding(inputs)
        lstm_output, state_h, state_c = self.lstm(x, initial_state=states)
        # Use the final hidden state as the query over all LSTM outputs
        context_vector, _ = self.attention(state_h, lstm_output)
        x = tf.concat([context_vector, state_h], axis=-1)
        x = self.dense(x)
        return x, (state_h, state_c)

Assignment 2: Implement Transformer Model Using TensorFlow

Task:

 Implement a transformer-based text classification model using TensorFlow.

Sample Code Structure:


import tensorflow as tf
from tensorflow.keras.layers import Embedding, Dense, Dropout, GlobalAveragePooling1D
from tensorflow.keras.models import Model

# Uses the MultiHeadAttention layer defined in Class Task 2. Layer normalization
# and positional encodings are left out to keep the sample short.
class TransformerBlock(tf.keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super().__init__()
        self.att = MultiHeadAttention(embed_dim, num_heads)
        self.ffn = tf.keras.Sequential([Dense(ff_dim, activation="relu"), Dense(embed_dim)])
        self.dropout1 = Dropout(rate)
        self.dropout2 = Dropout(rate)

    def call(self, inputs):
        attn_output = self.att(inputs, inputs, inputs)
        out1 = self.dropout1(attn_output + inputs)
        ffn_output = self.ffn(out1)
        return self.dropout2(ffn_output + out1)

# Define a simple transformer model
class TransformerModel(Model):
    def __init__(self, vocab_size, embed_dim, num_heads, ff_dim):
        super().__init__()
        self.embedding = Embedding(vocab_size, embed_dim)
        self.transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim)
        self.pool = GlobalAveragePooling1D()  # Average token vectors into one vector per text
        self.dense = Dense(1, activation="sigmoid")

    def call(self, inputs):
        x = self.embedding(inputs)
        x = self.transformer_block(x)
        x = self.pool(x)
        return self.dense(x)

Conclusion

This lesson provides hands-on experience in implementing attention mechanisms, covering
basic attention, multi-head attention, and transformers using TensorFlow.
Week 12: End-to-End Speech Processing Models
Objective:

• Understand the concept of end-to-end speech processing models.
• Learn how to build a simple speech recognition system using Python.
• Use pre-trained deep learning models (e.g., DeepSpeech, Wav2Vec 2.0).
• Implement a basic neural network for speech recognition.

Key Terms & Definitions

1. End-to-End Speech Recognition – An approach in which a single neural network learns to
map audio input directly to text output, without separately built acoustic, pronunciation,
and language-model stages.
2. DeepSpeech – An open-source deep learning model for automatic speech recognition (ASR)
developed by Mozilla.
3. Wav2Vec 2.0 – A self-supervised model by Facebook AI that learns speech representations
from raw, unlabeled audio and is then fine-tuned for ASR.
4. Mel-Frequency Cepstral Coefficients (MFCCs) – Compact features extracted from short
frames of an audio signal that summarize its spectral shape on the mel scale; widely used
to represent speech.
5. Recurrent Neural Network (RNN) – A type of neural network suited for processing
sequential data, such as speech.
6. Connectionist Temporal Classification (CTC) – A loss function used in ASR that lets a
model be trained on audio/transcript pairs without frame-level alignment, by summing over
all ways of aligning the input audio frames with the output text.
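To make the CTC entry more concrete, the sketch below shows the greedy decoding rule that turns one label per audio frame into text: merge consecutive repeats, then drop the blank symbol. The "_" blank marker, the function name, and the frame sequence are illustrative assumptions, not taken from any particular library.

```python
from itertools import groupby

BLANK = "_"  # Hypothetical symbol standing in for the CTC blank token

def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse a per-frame label sequence into text using the CTC rule."""
    # 1) Merge runs of identical consecutive labels into one label
    merged = [label for label, _ in groupby(frame_labels)]
    # 2) Drop the blank symbol, which only separates repeated characters
    return "".join(label for label in merged if label != blank)

# Ten audio frames, each already reduced to its most likely label
frames = list("hh_e_ll_lo")
print(ctc_greedy_decode(frames))  # hello
```

The blank between the two runs of "l" is what allows the double letter in "hello" to survive the merge step.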

Class Task 1: Build a Simple Speech-to-Text System Using the SpeechRecognition Library
Step-by-Step Code Explanation

1. Install Dependencies

Ensure you have the required libraries installed:

pip install SpeechRecognition pyaudio

2. Import Libraries
import speech_recognition as sr

3. Initialize the Recognizer


recognizer = sr.Recognizer()

4. Capture Audio from Microphone


with sr.Microphone() as source:
    print("Say something...")
    recognizer.adjust_for_ambient_noise(source)  # Adjust for background noise
    audio = recognizer.listen(source)  # Capture speech

5. Convert Speech to Text


try:
    text = recognizer.recognize_google(audio)  # Use Google's web ASR (needs internet access)
    print("You said:", text)
except sr.UnknownValueError:
    print("Sorry, could not understand the audio")
except sr.RequestError as e:
    print("Error with the recognition service:", e)

Expected Output

• The program listens on the default microphone until the speaker pauses, then prints the
transcribed text.

Class Task 2: Using a Pre-trained DeepSpeech Model for Speech Recognition
1. Install Dependencies
pip install deepspeech numpy

Download the DeepSpeech model (.pbmm) and scorer (.scorer) files from:
https://github.com/mozilla/DeepSpeech/releases

2. Load the Model


import deepspeech
import numpy as np
import wave

model_file_path = 'deepspeech-0.9.3-models.pbmm' # Path to DeepSpeech model


scorer_file_path = 'deepspeech-0.9.3-models.scorer' # Path to scorer file

model = deepspeech.Model(model_file_path)
model.enableExternalScorer(scorer_file_path)

3. Load an Audio File


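def read_wav_file(filename):
    # DeepSpeech expects 16-bit mono PCM at the model's sample rate (16 kHz for the 0.9.3 English model)
    with wave.open(filename, 'rb') as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        audio = np.frombuffer(wf.readframes(frames), dtype=np.int16)
    return rate, audio

audio_file = "sample_audio.wav"  # Replace with your own .wav file
rate, audio = read_wav_file(audio_file)
if rate != model.sampleRate():
    print("Warning: audio is", rate, "Hz but the model expects", model.sampleRate(), "Hz")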

4. Perform Speech Recognition


text = model.stt(audio)
print("Transcribed Text:", text)

Expected Output

• The system will transcribe the speech from the given .wav file and display the text
output.
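Transcription quality depends on the audio format matching what the model was trained on. The sketch below uses only the standard library to check a .wav file before passing it to a recognizer. The expected values (16 kHz, mono, 16-bit) are an assumption based on the released English DeepSpeech models, and the helper names and generated test tones are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave

EXPECTED_RATE = 16000   # Assumed: sample rate of the released English DeepSpeech models
EXPECTED_CHANNELS = 1   # Mono
EXPECTED_WIDTH = 2      # Bytes per sample (16-bit PCM)

def check_wav_format(path):
    """Return a list of format problems; an empty list means the file looks compatible."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != EXPECTED_RATE:
            problems.append(f"sample rate is {wf.getframerate()} Hz")
        if wf.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wf.getnchannels()} channels")
        if wf.getsampwidth() != EXPECTED_WIDTH:
            problems.append(f"{8 * wf.getsampwidth()}-bit samples")
    return problems

def write_test_tone(path, rate, seconds=0.1, freq=440.0):
    """Write a short mono 16-bit sine tone so the check can be tried without real speech."""
    n = int(rate * seconds)
    samples = [int(12000 * math.sin(2 * math.pi * freq * i / rate)) for i in range(n)]
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(struct.pack(f"<{n}h", *samples))

tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "tone_16k.wav")
bad_path = os.path.join(tmp_dir, "tone_44k.wav")
write_test_tone(good_path, 16000)
write_test_tone(bad_path, 44100)

print(check_wav_format(good_path))  # []
print(check_wav_format(bad_path))   # ['sample rate is 44100 Hz']
```

A file that fails the check can be resampled first, for example with librosa.load(path, sr=16000) as used in the assignments below.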

Assignments
Assignment 1: Build a Custom Speech Recognition Model Using Wav2Vec 2.0

Task:

• Use the Wav2Vec 2.0 implementation in Hugging Face's transformers library to convert
speech to text.
• Use a sample .wav file for transcription.

Sample Code Structure:


from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import torch
import librosa

# Load pre-trained Wav2Vec2.0 model


processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Load an audio file


audio_file = "sample_audio.wav"
audio, rate = librosa.load(audio_file, sr=16000)

# Preprocess the audio


input_values = processor(audio, sampling_rate=rate,
return_tensors="pt").input_values

# Perform inference
with torch.no_grad():
    logits = model(input_values).logits

# Decode the output


predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print("Transcribed Text:", transcription)

Expected Output

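• The transcription of the .wav file, printed in upper case without punctuation (the
vocabulary of this checkpoint contains capital letters and an apostrophe only).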

Assignment 2: Implement a Simple Speech Command Classifier Using TensorFlow

Task:

• Train a simple deep learning model to classify speech commands like "yes," "no," "stop,"
etc.
• Use TensorFlow/Keras with a small dataset.

Sample Code Structure:


import tensorflow as tf
import numpy as np
import librosa
import os

# Load dataset (assumes 'yes' and 'no' subfolders containing .wav files)
dataset_path = "speech_commands_dataset/"
SAMPLE_RATE = 16000
NUM_SAMPLES = 16000  # 1 second of audio, matching the model's input_shape

# Load audio files
def load_audio_files(directory):
    data, labels = [], []
    label_map = {"yes": 0, "no": 1}  # Assign numerical labels
    for label in label_map:
        files = os.listdir(os.path.join(directory, label))
        for file in files:
            audio_path = os.path.join(directory, label, file)
            audio, _ = librosa.load(audio_path, sr=SAMPLE_RATE)
            # Pad with zeros or truncate so every clip has exactly NUM_SAMPLES samples
            audio = np.pad(audio[:NUM_SAMPLES], (0, max(0, NUM_SAMPLES - len(audio))))
            data.append(audio)
            labels.append(label_map[label])
    return np.array(data), np.array(labels)

# Prepare dataset
X, y = load_audio_files(dataset_path)
X = np.expand_dims(X, axis=-1)  # Add channel dimension -> (num_clips, 16000, 1)

# Build a simple CNN model


model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(16, kernel_size=3, activation='relu',
                           input_shape=(16000, 1)),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(32, kernel_size=3, activation='relu'),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')  # Two classes (yes/no)
])

# Compile and train the model


model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=16)

# Save the model


model.save("speech_command_model.h5")

Expected Output

• A trained model that can classify speech commands.
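It can help to work out how large the network above becomes when it is fed raw one-second waveforms. The sketch below repeats the layer arithmetic in plain Python, assuming the Keras defaults used in the sample code ('valid' padding, convolution stride 1, pooling stride equal to pool_size); the helper names are illustrative.

```python
def conv1d_valid_length(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1

def max_pool1d_length(length, pool_size):
    """Output length of 1-D max pooling with stride equal to pool_size and 'valid' padding."""
    return (length - pool_size) // pool_size + 1

length = 16000                              # 1 second of audio at 16 kHz
length = conv1d_valid_length(length, 3)     # Conv1D(16, kernel_size=3)
length = max_pool1d_length(length, 2)       # MaxPooling1D(pool_size=2)
length = conv1d_valid_length(length, 3)     # Conv1D(32, kernel_size=3)
length = max_pool1d_length(length, 2)       # MaxPooling1D(pool_size=2)

flattened = length * 32                     # Flatten: time steps x 32 channels
dense_params = flattened * 64 + 64          # Weights plus biases of Dense(64)

print("Time steps after second pooling:", length)   # 3998
print("Flattened features:", flattened)             # 127936
print("Parameters in Dense(64):", dense_params)     # 8187968
```

Almost all of the model's parameters sit in the first dense layer, which is one reason practical speech command classifiers usually start from MFCCs or spectrograms rather than raw samples.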

Conclusion

This lesson provides students with hands-on experience in speech processing, covering simple
ASR using Python’s SpeechRecognition, DeepSpeech, and advanced deep learning models like
Wav2Vec 2.0 and TensorFlow-based classifiers.
Week 13: Issues and Architectures for NLP
Objective:

• Understand key issues and challenges in NLP, such as ambiguity, bias, and
scalability.
• Explore various NLP architectures, including RNNs, LSTMs, Transformers, and
BERT.
• Research and analyze future trends in NLP like self-supervised learning, multilingual
models, and low-resource NLP.
• Implement basic and advanced NLP models using Python and TensorFlow.

Key Terms & Definitions


1. Natural Language Processing (NLP) – A branch of AI that helps computers
understand, interpret, and generate human language.
2. Tokenization – Splitting text into meaningful units (words, subwords, or characters).
3. Word Embeddings – Representing words as dense numerical vectors, either static (e.g.,
Word2Vec, GloVe) or contextual (e.g., BERT embeddings).
4. Recurrent Neural Networks (RNNs) – A family of neural networks that process sequential
data one step at a time while carrying a hidden state.
5. Long Short-Term Memory (LSTM) – A type of RNN whose gates mitigate the vanishing-gradient
problem, making long-range dependencies easier to learn.
6. Transformers – A deep learning architecture that replaces recurrence with
self-attention, so every token can attend directly to every other token.
7. BERT (Bidirectional Encoder Representations from Transformers) – A transformer-based
encoder model that uses context from both the left and the right of each word.
8. GPT (Generative Pre-trained Transformer) – A transformer-based decoder model
pre-trained to predict the next token, specialized in text generation.
9. Multilingual NLP – Models trained to process multiple languages with a single set of
parameters.
10. Ethical NLP – Addressing challenges such as bias, fairness, and interpretability in
language models.

Class Task 1: Implement Text Preprocessing and Word Embeddings in Python
Step-by-Step Code Explanation
1. Install Dependencies
pip install nltk gensim

2. Import Required Libraries


import nltk
from nltk.tokenize import word_tokenize
from gensim.models import Word2Vec
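nltk.download('punkt')      # Tokenizer data used by word_tokenize
nltk.download('punkt_tab')  # Newer NLTK releases look for this resource instead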

3. Tokenize and Preprocess Text


text = "Natural language processing enables computers to understand human language."
tokens = word_tokenize(text.lower())  # Convert to lowercase and tokenize
print("Tokens:", tokens)

4. Train a Word2Vec Model


# Prepare training data
sentences = [tokens] # Word2Vec requires a list of tokenized sentences
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)

# Get vector representation of a word


word_vector = model.wv['language']
print("Word Vector for 'language':", word_vector[:10]) # Show first 10 values

Expected Output

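• Tokenized text: a list of lowercase tokens, with the final period as its own token.
• Numerical word embeddings: the first 10 of the 100 values for 'language'. With a single
training sentence these values are essentially random; meaningful embeddings require a
much larger corpus.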
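Word vectors are usually compared with cosine similarity. The sketch below shows that calculation in plain Python; the 3-dimensional vectors are hypothetical, hand-picked values for illustration and were not produced by the Word2Vec model above.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only
toy_vectors = {
    "language": [0.9, 0.1, 0.3],
    "speech":   [0.8, 0.2, 0.4],
    "banana":   [0.1, 0.9, 0.0],
}

for word in ("speech", "banana"):
    score = cosine_similarity(toy_vectors["language"], toy_vectors[word])
    print(f"language vs {word}: {score:.3f}")
```

With a trained gensim model, model.wv.similarity('language', 'human') performs the same calculation on the learned vectors.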

Class Task 2: Implement a Transformer-Based Text Classification Model
1. Install Transformers Library
pip install transformers torch datasets

2. Load Pre-Trained BERT Model


from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load tokenizer and model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Note: the 2-class classification head is newly initialized, not pre-trained
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)

3. Tokenize Input Sentence


sentence = "NLP is an exciting field of artificial intelligence."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True,
padding=True)

4. Perform Text Classification


with torch.no_grad():  # No gradients are needed for prediction
    outputs = model(**inputs)
logits = outputs.logits
prediction = torch.argmax(logits, dim=1).item()
print("Predicted Class:", prediction)

Expected Output

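• The model prints a class index, 0 or 1. Because the classification head on top of
bert-base-uncased is randomly initialized, the prediction is arbitrary until the model is
fine-tuned on labeled data (e.g., positive/negative sentiment).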
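The classifier returns raw scores (logits), and the argmax picks the largest one. To see how confident a prediction is, the logits can be turned into probabilities with softmax. The sketch below does this in plain Python; the two logit values are hypothetical and not actual BERT outputs.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # Subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.3, -0.1]  # Hypothetical logits for classes 0 and 1
probs = softmax(logits)
prediction = max(range(len(probs)), key=probs.__getitem__)  # Same role as argmax

print("Probabilities:", [round(p, 3) for p in probs])
print("Predicted class:", prediction)
```

In PyTorch the equivalent call on the model output is torch.softmax(logits, dim=1).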

Assignments
Assignment 1: Research on Future Trends in NLP

Task:

• Write a report on future trends in NLP, covering:
    ◦ Self-Supervised Learning (SSL) (e.g., Wav2Vec, T5, BERT)
    ◦ Multilingual NLP (e.g., mBERT, XLM-R)
    ◦ Low-Resource NLP (training models with minimal data)
    ◦ Ethics & Bias in NLP

Expected Submission:

• A 3–5 page report with examples of real-world NLP applications.

Assignment 2: Implement a GPT-Based Text Generator

Task:

• Use GPT-2 to generate text based on a prompt.


Sample Code Structure:
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 tokenizer and model


tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Input text prompt


prompt = "The future of natural language processing is"

# Tokenize input
inputs = tokenizer(prompt, return_tensors="pt")

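# Generate text (greedy decoding by default; max_length counts the prompt tokens too)
output = model.generate(**inputs, max_length=50,
                        pad_token_id=tokenizer.eos_token_id)  # GPT-2 has no pad token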
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("Generated Text:", generated_text)

Expected Output

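• The model prints the prompt followed by its continuation, up to 50 tokens in total. With
the default greedy decoding the output is deterministic and can become repetitive.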
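Text generation works one token at a time: the model scores every possible next token, one is chosen, and the loop repeats. The sketch below illustrates greedy decoding with a hypothetical lookup table in place of GPT-2. Unlike a real language model, this toy table conditions only on the last token, and all probabilities are invented for illustration.

```python
# Hypothetical next-token probabilities, standing in for a language model
NEXT_TOKEN_PROBS = {
    "the":    {"future": 0.6, "model": 0.4},
    "future": {"of": 0.9, "is": 0.1},
    "of":     {"nlp": 0.7, "the": 0.3},
    "nlp":    {"is": 0.8, "<eos>": 0.2},
    "is":     {"bright": 0.5, "the": 0.3, "<eos>": 0.2},
    "bright": {"<eos>": 1.0},
    "model":  {"<eos>": 1.0},
}

def greedy_generate(prompt_tokens, max_length):
    """Extend the prompt one token at a time, always taking the most likely next token."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:
        candidates = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not candidates:
            break  # No known continuation for this token
        next_token = max(candidates, key=candidates.get)
        if next_token == "<eos>":
            break  # End-of-sequence token stops generation
        tokens.append(next_token)
    return tokens

print(" ".join(greedy_generate(["the"], max_length=10)))  # the future of nlp is bright
```

Passing do_sample=True to model.generate replaces the max step with random sampling from the predicted distribution, which gives more varied text.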

Conclusion
This lesson covers NLP issues, architectures, and future trends, along with hands-on
implementations of Word Embeddings, Transformers, and GPT-based text generation.
