Computational Intelligence Endsem

The document outlines various text representation techniques in machine learning, including Bag of Words (BoW), TF-IDF, Word2Vec, GloVe, and BERT, detailing their methodologies, advantages, and applications. It also discusses the Seq2Seq model for neural machine translation and evaluation metrics like BLEU and BERT Score. Additionally, it covers Neural Style Transfer and highlights the benefits of BERT over traditional language models.

Unit 5

Bag of Words (BoW)

- Bag of Words is a simple and commonly used method to represent text data in machine
learning.
- It creates a vocabulary of all unique words from a collection of documents and represents
each document as a vector based on word frequency.

How it works:
- It ignores grammar and word order.
- It only considers whether known words occur in the document and how often.
- Each document is represented as a vector.

Example:
Let’s say we have two sentences:
- Doc1: “I like NLP”
- Doc2: “NLP is fun”

Vocabulary = [I, like, NLP, is, fun]


BoW Vectors:
- Doc1: [1, 1, 1, 0, 0]
- Doc2: [0, 0, 1, 1, 1]
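
A minimal sketch of the same example using scikit-learn's CountVectorizer (assuming scikit-learn is installed; the default tokenizer drops one-letter tokens such as "I", so a custom token pattern is used here):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["I like NLP", "NLP is fun"]

# token_pattern keeps one-letter words like "I"; lowercase=False keeps "NLP" intact
vectorizer = CountVectorizer(token_pattern=r"(?u)\b\w+\b", lowercase=False)
bow = vectorizer.fit_transform(docs)

# Columns are sorted alphabetically, so the order differs from the hand-worked vectors above
print(vectorizer.get_feature_names_out())
print(bow.toarray())
```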

Advantages:
- Simple and easy to implement.
- Works well for small datasets.

Describe the TF-IDF (Term Frequency-Inverse Document Frequency) weighting scheme and its significance in text representation?

- TF-IDF is a statistical measure that reflects how important a word is to a document in a collection by combining its frequency in the document and its rarity across the corpus.
- It is the product of two components:

1. Term Frequency (TF)

- Measures how frequently a term occurs in a document.
- Formula: TF(t, d) = (number of times term t appears in document d) / (total number of terms in document d)
- Purpose: Words appearing more times in a document are more relevant to that document.

2. Inverse Document Frequency (IDF)

- Measures how unique or rare a term is across all documents in the corpus.
- Formula: IDF(t) = log(N / n_t), where N is the total number of documents and n_t is the number of documents containing term t.

- Purpose: Reduces the weight of common terms (e.g., "is", "the") and increases the
importance of rare or meaningful terms.

3. TF-IDF Score

- Formula: TF-IDF(t, d) = TF(t, d) × IDF(t)

Interpretation: A high TF-IDF score indicates a word is frequent in a specific document but rare
across the corpus — hence, it's important for that document.
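
A short sketch with scikit-learn's TfidfVectorizer (an assumption: scikit-learn is available; note that sklearn applies a smoothed IDF and L2-normalizes each row, so the numbers differ slightly from the plain formulas above):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["I like NLP", "NLP is fun", "NLP is everywhere"]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())   # corpus vocabulary
# One TF-IDF vector per document; "nlp" appears in every document, so it gets a lower weight
print(tfidf.toarray().round(2))
```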

Significance in Text Representation

1. Helps convert textual data into numerical vectors, suitable for machine learning models.
2. Reduces the influence of common words and highlights meaningful terms.
3. Commonly used in text classification, information retrieval, clustering, and search engines.
4. Efficient and simple compared to deep learning methods for small and medium-sized datasets.

Word2Vec

- Word2Vec is a neural network-based technique that converts words into dense vector
representations where semantically similar words are closer in the vector space.
- Word2Vec is a predictive model: it predicts surrounding words from the current word (the Skip-gram variant) or the current word from its surrounding words (the CBOW variant).
- It trains faster on smaller datasets.
- It uses little memory because it does not store a full co-occurrence matrix.
- A well-known pretrained example is Google's Google News Word2Vec model.

Example:
"Paris" and "France" will be closer in vector space than "Paris" and "banana".

Advantages:
- Word2Vec can be applied to different types of text data (like news articles, social media posts,
etc.) and still learn meaningful representations.
- Dense and low-dimensional embeddings.

GloVe (Global Vectors)

- GloVe is a word embedding technique that uses global word co-occurrence statistics from a
corpus. It captures how often words appear together in the entire corpus.
- GloVe is a count-based model — it builds a co-occurrence matrix and learns embeddings from
it.
- It is efficient for large datasets but needs to compute a matrix first.
- Uses more memory since it stores and processes a large matrix during training.
- GloVe was developed at Stanford, and Stanford's pretrained GloVe vectors are widely used.

Example:
Words like “king” and “queen” will have similar vectors because they appear in similar contexts.

Advantages:
- Produces meaningful and consistent word vectors.
- Useful for many NLP applications.
- Works well with large datasets.
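
GloVe embeddings are usually consumed as pretrained vectors rather than trained from scratch. A minimal sketch that loads a downloaded GloVe file (an assumption: the file glove.6B.50d.txt from the Stanford GloVe release is present in the working directory):

```python
import numpy as np

# Each line of the file is: word value_1 value_2 ... value_50
embeddings = {}
with open("glove.6B.50d.txt", encoding="utf-8") as f:
    for line in f:
        word, *values = line.split()
        embeddings[word] = np.asarray(values, dtype=np.float32)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["king"], embeddings["queen"]))   # high: similar contexts
print(cosine(embeddings["king"], embeddings["banana"]))  # much lower
```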

Compare and contrast Word2Vec and GloVe in terms of how they generate
word embeddings?

Aspect | Word2Vec | GloVe (Global Vectors)
Main Idea | Learns word meanings from the local context (neighbors) in a sentence. | Learns word meanings from how often words appear together across a large corpus.
Model Type | Predictive – predicts surrounding words from the current word (or vice versa). | Count-based – builds a word co-occurrence matrix, then learns embeddings from it.
Focus Area | Local context – a small window of nearby words. | Global context – statistics over the entire corpus.
Training Speed | Faster on smaller datasets. | Efficient for large datasets, but requires computing the matrix first.
Memory Usage | Low, since it does not store a full matrix. | Higher, since it stores a large matrix during training.
Use Case | Google News Word2Vec model. | Stanford's GloVe model.

Neural Word Embedding

- Unlike static methods, neural word embeddings like BERT generate contextualized
representations, meaning the same word will have different embeddings based on the sentence.

Example:
“He went to the bank to deposit money.”
“He sat by the river bank.”

The word “bank” has different meanings and will get different embeddings in BERT.
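
A brief sketch of this effect using the Hugging Face transformers library (assumptions: transformers and torch are installed and the bert-base-uncased checkpoint can be downloaded):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_embedding(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]              # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index("bank")]                            # contextual vector of "bank"

v_money = bank_embedding("He went to the bank to deposit money.")
v_river = bank_embedding("He sat by the river bank.")

# Same surface word, different contexts -> similarity noticeably below 1.0
print(torch.cosine_similarity(v_money, v_river, dim=0).item())
```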

Advantages:
- Captures context and semantics accurately.
- Better for downstream tasks like translation, question answering, etc.

Disadvantages:
- Computationally expensive.
- Requires large resources for training and inference.

Explain the architecture of a Seq2Seq model and its role in neural machine
translation?

Seq2Seq Model Architecture (Sequence-to-Sequence)

- The Seq2Seq model is a type of encoder-decoder neural architecture designed to map variable-length input sequences to variable-length output sequences.
- It is heavily used in tasks like Neural Machine Translation (NMT), text summarization, and question answering.

It mainly consists of two parts: Encoder and Decoder

1. Encoder
- The encoder is responsible for processing the entire input sequence and encoding it into a
fixed-length context vector
- It is usually implemented using RNNs, LSTMs, or GRUs.
- The encoder reads the input sentence (source language) one word at a time.

2. Context Vector
- In the vanilla Seq2Seq model, the final hidden state of the encoder, h_T, is called the context vector.
- It is intended to carry all the semantic information from the input sequence to guide the decoder during output generation.

3. Decoder
- The decoder takes the context vector from the encoder and starts generating the output
sentence (target language) one word at a time.
- It uses the context vector and the previously generated words to predict the next word.
- It continues until it produces the end-of-sequence token.
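
A minimal encoder-decoder sketch in PyTorch (assumptions: this is a bare GRU-based vanilla Seq2Seq without attention, and the vocabulary sizes and dimensions are placeholders):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len) of token ids
        _, hidden = self.rnn(self.embed(src))     # hidden: (1, batch, hid_dim)
        return hidden                             # final hidden state = context vector

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, prev_token, hidden):        # prev_token: (batch, 1), previously generated word
        output, hidden = self.rnn(self.embed(prev_token), hidden)
        return self.out(output), hidden           # logits over the target vocabulary
```

At inference time the decoder starts from a start-of-sequence token, receives the encoder's context vector as its initial hidden state, and feeds its own predictions back in until it emits the end-of-sequence token.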

Role of Seq2Seq in Neural Machine Translation

1. Sentence Understanding: The encoder converts the full sentence into a context vector that captures its meaning.
2. Language Generation: The decoder uses the context to generate the translated sentence, one word at a time.
3. Flexible Input/Output Lengths: It can handle input and output sentences of different lengths, which is common in translation.
4. Learnable from Data: It learns from parallel corpora (pairs of sentences in the source and target languages).

BLEU Score & BERT Score

BLEU Score

- BLEU is one of the oldest and most commonly used metrics for evaluating machine
translation output.
- It was introduced in 2002 by IBM researchers.
- The idea is simple: compare the machine’s output to one or more human reference
translations and check how much they match in terms of word sequences (n-grams).

How BLEU Works:

1. BLEU looks at how many n-grams (unigrams = 1 word, bigrams = 2 words, trigrams = 3 words, etc.) from the machine translation appear in the reference translation.
2. BLEU focuses on precision, i.e., how many of the words in the machine translation are correct, not whether it captured everything from the reference.
3. If the machine translation is too short, BLEU applies a brevity penalty to reduce the score.
4. The final BLEU score is the geometric mean of the n-gram precisions multiplied by the brevity penalty (a small worked example follows this list).
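
A quick sketch using NLTK's implementation (an assumption: nltk is installed; smoothing is added so that missing higher-order n-grams do not collapse the score to zero on such a short sentence):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "is", "on", "the", "mat"]]   # one or more human references
candidate = ["the", "cat", "sat", "on", "the", "mat"]    # machine translation output

score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 3))
```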

BERT Score

- BERT Score is a modern evaluation metric that leverages deep learning models (like BERT)
to evaluate translation by checking the meaning (semantics) rather than just word overlap.
- It was introduced around 2019 to address the flaws in BLEU by using contextual embeddings.

How BERT Score Works:

1. BERT transforms both the candidate and reference sentences into contextual
embeddings—each word is represented as a vector that captures meaning based on context.

Example:
“He went to the bank to deposit money.” → BERT understands “bank” as a financial institution.
“He sat by the river bank.” → Now “bank” refers to a riverside. BERT captures this difference.

2. For each word in the candidate sentence, it finds the most similar word (in terms of meaning)
in the reference sentence using cosine similarity between their vector representations.
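
A short sketch using the bert-score package (an assumption: the bert-score library is installed; it downloads a pretrained model on first use):

```python
from bert_score import score

candidates = ["He sat by the river bank."]
references = ["He rested on the bank of the river."]

# Precision, recall and F1 computed from cosine similarities of contextual embeddings
P, R, F1 = score(candidates, references, lang="en")
print(F1.item())
```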

Neural Style Transfer

- Neural Style Transfer is a fascinating application of deep learning that involves the artistic
transformation of images by combining the content of one image with the style of another.
- The technique uses Convolutional Neural Networks (CNNs) to separate and recombine
content and style features from two input images.
- Neural Style Transfer typically involves a pre-trained CNN, where the content features are
extracted from the content image and the style features are extracted from the style image.
- The content and style features are then combined in a way that generates a new image with
the content of the first image but the artistic style of the second.
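
A condensed sketch of the loss computation in PyTorch (assumptions: torchvision's pretrained VGG-19 is used as the feature extractor, the chosen layer indices and loss weights are illustrative, and image loading/preprocessing is omitted):

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# Frozen pretrained VGG-19, used only as a feature extractor
vgg = vgg19(weights=VGG19_Weights.DEFAULT).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

LAYERS = (0, 5, 10, 19, 28)   # illustrative choice of convolutional layers

def extract_features(img):
    """Return activations of the chosen layers for a (1, 3, H, W) image tensor."""
    feats, x = [], img
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in LAYERS:
            feats.append(x)
    return feats

def gram_matrix(f):
    """Channel-wise feature correlations; these statistics encode the style."""
    _, c, h, w = f.shape          # batch size assumed to be 1
    f = f.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

def style_transfer_step(generated, content_feats, style_grams, optimizer,
                        content_weight=1.0, style_weight=1e6):
    """One optimization step on the generated image."""
    optimizer.zero_grad()
    g_feats = extract_features(generated)
    content_loss = F.mse_loss(g_feats[-1], content_feats[-1])   # match content at a deep layer
    style_loss = sum(F.mse_loss(gram_matrix(g), sg)             # match style statistics at several depths
                     for g, sg in zip(g_feats, style_grams))
    loss = content_weight * content_loss + style_weight * style_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the generated image is initialized as a copy of the content image with gradients enabled, content_feats and style_grams are precomputed from the content and style images, and the step above is repeated for a few hundred iterations with an optimizer such as Adam.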

Example:
- Content Image: A photo of you standing in front of a building.
- Style Image: A painting by Leonardo da Vinci.
- Output: Your photo now looks like it was painted by Leonardo da Vinci, while still showing you
and the building clearly.

Applications:
1. Art Creation: Turn real photos into artwork styled like famous paintings.
2. Social Media Filters: Some Instagram and Snapchat filters use NST to give images artistic effects.
3. Advertising & Design: Used for stylized visuals.
4. AI Art Tools: Apps like Prisma and DeepArt use NST.

How does BERT (Bidirectional Encoder Representations from Transformers) work, and what are its advantages over traditional language models?

- BERT is a pre-trained deep bidirectional Transformer-based language model introduced by Google in 2018.
- Unlike traditional unidirectional models, BERT reads text in both directions (left-to-right and
right-to-left) using the Transformer encoder architecture.
- BERT is pretrained on a massive amount of text data, learning contextual embeddings for
each word in a sentence.
- This pretraining allows BERT to understand the relationships between words, making it
effective in a wide range of NLP tasks.

Working of BERT

1. Input Formatting
- Adds special tokens: [CLS] at the start, [SEP] between sentences.
- Words are broken into sub-words using WordPiece.

2. Embeddings
- Each token gets Token + Segment + Position embeddings.

3. Transformer Encoder
- Only the encoder is used.
- Applies bidirectional self-attention to understand the full context.

4. Pre-training Tasks
- Masked Language Modeling (MLM): Predict masked words (illustrated in the sketch after this list).
- Next Sentence Prediction (NSP): Check if sentence B follows sentence A.

5. Fine-tuning
- The model is adapted to specific tasks with task-specific layers.

6. Output
- The [CLS] token embedding is used for classification.
- Full token embeddings are used for token-level tasks.
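
A tiny sketch of the Masked Language Modeling behaviour via the transformers fill-mask pipeline (an assumption: the transformers library is installed and bert-base-uncased can be downloaded):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# BERT uses the words on BOTH sides of [MASK] to rank candidate fillers
for pred in fill("He went to the [MASK] to deposit money."):
    print(pred["token_str"], round(pred["score"], 3))
```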

Advantages of BERT over Traditional Language Models (BAPP)


1. Bidirectional Context Understanding
- Unlike traditional models that read text left-to-right or right-to-left, BERT reads both directions
at once, capturing full context of a word based on all surrounding words.

2. Ambiguity
- Since embeddings are context-aware, BERT can distinguish between different meanings of
the same word depending on the sentence.

3. Pre-trained
- BERT is pre-trained on massive data using unsupervised tasks, giving it a rich understanding
of language.

4. Parallel Processing
- Based on the Transformer architecture, BERT allows parallel computation, which speeds up
training and inference.
