1 Imagine two students, Priya and Arjun, in a university library where a group nearby is talking
loudly. Priya says to Arjun, "This noise is making it hard to study."
a. Discourse Meaning: Analyze Priya’s utterance from a discourse meaning perspective. What
is the primary information conveyed at this level?
b. Pragmatic Meaning: Analyze the utterance from a pragmatic meaning perspective. How
does the pragmatic interpretation build upon or differ from the discourse meaning in this
context?
c. Pragmatic Inferences: Identify at least two potential pragmatic inferences Arjun might make
based on Priya’s statement, considering the library context and their relationship as students.
2 Consider the following two sentences:
Sentence 1: "The clever rabbit runs from danger."
Sentence 2: "The fox chases the rabbit quickly."
a. Vocabulary Creation: Create a vocabulary of unique words from both sentences, listing
them in alphabetical order.
b. Bag-of-Words Representation: Represent each sentence using the Bag-of-Words (BOW)
model. Provide the frequency count for each word in the vocabulary for both sentences.
c. One-Hot Encoding: Create One-Hot Encoding (OHE) vectors for the words “rabbit” and
“fox” based on the vocabulary, using the alphabetical order to determine the index in the
OHE vector.
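For illustration, a minimal Python sketch of parts (a)–(c), assuming lowercasing and stripping the final period (choices the question deliberately leaves open):

```python
from collections import Counter

sent1 = "The clever rabbit runs from danger."
sent2 = "The fox chases the rabbit quickly."

def tokenize(text):
    # Lowercase and strip trailing punctuation (an assumption, not mandated by the question).
    return [w.strip(".").lower() for w in text.split()]

tokens1, tokens2 = tokenize(sent1), tokenize(sent2)

# (a) Alphabetically ordered vocabulary of unique words from both sentences.
vocab = sorted(set(tokens1) | set(tokens2))

# (b) Bag-of-Words frequency counts over the shared vocabulary.
bow1 = [Counter(tokens1)[w] for w in vocab]
bow2 = [Counter(tokens2)[w] for w in vocab]

# (c) One-hot vectors: a 1 at the word's alphabetical index, 0 elsewhere.
def one_hot(word):
    return [1 if w == word else 0 for w in vocab]

print(vocab)
print(bow1, bow2)
print(one_hot("rabbit"), one_hot("fox"))
```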
3 Analyze the limitations of the Hidden Markov Model (HMM) in Natural Language Processing
(NLP) when applied to tasks like part-of-speech tagging or speech recognition.
a. Key Limitations: Identify and explain three main limitations of HMMs in handling complex
language tasks.
b. Impact on Performance: Discuss how these limitations affect HMM performance in a
specific NLP task (e.g., part-of-speech tagging for social media text).
c. Alternatives: Suggest one alternative model or approach that addresses at least one of
these limitations, and briefly explain why it’s more effective.
4 Using the corpus: “dog runs in the park”
Tokenize the corpus and list all tokens. Create a vocabulary with word-to-ID mapping. Justify
your tokenization choices (e.g., handling punctuation, case sensitivity).
Choose a window size for a Skip-Gram model, justify your choice, and generate all (target,
context) training pairs from the numerically encoded corpus.
Discuss two advantages and two disadvantages of the Skip-Gram model for word embedding
generation in NLP.
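One possible sketch of the mechanical parts of this question, assuming a window size of 2 and whitespace tokenization of the lowercased corpus (both are choices the question asks you to justify):

```python
corpus = "dog runs in the park"
tokens = corpus.lower().split()          # ['dog', 'runs', 'in', 'the', 'park']

# Word-to-ID mapping in order of first appearance.
word_to_id = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
encoded = [word_to_id[w] for w in tokens]

# Generate (target, context) pairs for a Skip-Gram model with window size 2.
window = 2
pairs = []
for i, target in enumerate(encoded):
    for j in range(max(0, i - window), min(len(encoded), i + window + 1)):
        if j != i:
            pairs.append((target, encoded[j]))

print(word_to_id)
print(pairs)
```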
5 Explain the core principle of the Lesk Algorithm for Word Sense Disambiguation (WSD).
Describe how it leverages dictionary definitions to disambiguate a polysemous word in a
specific context.
a. Core Principle: Outline the Lesk Algorithm’s approach, focusing on how it compares word
contexts to dictionary senses.
b. Application Example: Consider the sentence “She deposited money in the bank.” Apply the
Lesk Algorithm to disambiguate “bank” (financial institution vs. riverbank). Assume dictionary
definitions:
Sense 1 (financial): “A place where money is stored or managed.”
Sense 2 (riverbank): “The edge of a river or stream.”
Show the overlap between the sentence context and each sense, and determine the correct sense.
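An illustrative, simplified Lesk-style overlap count for part (b), assuming whitespace tokenization and a small hand-picked stopword list (a full implementation would preprocess the glosses more carefully):

```python
def content_words(text, stopwords={"a", "the", "in", "of", "or", "is", "she"}):
    # Lowercase, strip punctuation, and drop a few function words.
    return {w.strip(".,").lower() for w in text.split()} - stopwords

sentence = "She deposited money in the bank."
senses = {
    "financial": "A place where money is stored or managed.",
    "riverbank": "The edge of a river or stream.",
}

context = content_words(sentence)
# Pick the sense whose definition shares the most content words with the sentence context.
overlaps = {name: len(context & content_words(gloss)) for name, gloss in senses.items()}
best = max(overlaps, key=overlaps.get)
print(overlaps, "->", best)
```

Here the financial sense overlaps with the context on “money”, while the riverbank sense has no overlap, so the algorithm selects the financial sense.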
7 Given the text: “The fasttt fox leaps over the idle dog at the field!! #fox #wildfox #fast”
Process this text using tokenization, stopword removal, and lemmatization.
a. Tokenization: Tokenize the text, listing all tokens. Explain your tokenization choices (e.g.,
handling hashtags, extra letters, punctuation).
b. Stopword Removal: Remove stopwords from the tokens, using a standard stopword list
(e.g., “the”, “at”, “over”). List the remaining tokens and justify any exclusions.
c. Lemmatization: Apply lemmatization to the remaining tokens. Provide the final processed
output and explain how lemmatization affects each token (e.g., “leaps” to “leap”).
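A minimal pure-Python sketch of the three steps, assuming hashtags are kept as single tokens, the repeated letters in “fasttt” are collapsed, and a tiny illustrative stopword list and lemma table (in practice a library such as NLTK or spaCy would supply these resources):

```python
import re

text = "The fasttt fox leaps over the idle dog at the field!! #fox #wildfox #fast"

# (a) Tokenization: keep hashtags intact, drop punctuation, collapse 3+ repeated letters.
raw = re.findall(r"#\w+|\w+", text.lower())
tokens = [re.sub(r"(.)\1{2,}", r"\1", t) for t in raw]

# (b) Stopword removal with a small illustrative list.
stopwords = {"the", "at", "over", "a", "an", "in"}
content = [t for t in tokens if t not in stopwords]

# (c) Lemmatization via a tiny hand-written lemma table (hypothetical; a real
# lemmatizer would consult a dictionary such as WordNet and use POS information).
lemmas = {"leaps": "leap"}
processed = [lemmas.get(t, t) for t in content]

print(tokens)
print(content)
print(processed)
```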
8 Consider a collection of three documents:
Document 1: “The new algorithm improves performance.”
Document 2: “Performance of the algorithm is key.”
Document 3: “This algorithm is efficient.”
a. Term Frequency (TF): Compute the TF of “algorithm” in Document 1.
b. Inverse Document Frequency (IDF): Calculate the IDF of “algorithm” across the collection.
c. TF-IDF Score: Multiply the TF and IDF values to obtain the TF-IDF score for “algorithm” in
Document 1.
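A short sketch of the computation, assuming TF is the raw count normalized by document length and idf(t) = log(N / df(t)) with the natural log; other textbook variants (raw counts, log base 10, +1 smoothing) are equally acceptable since the question does not pin one down:

```python
import math

docs = [
    "The new algorithm improves performance.",
    "Performance of the algorithm is key.",
    "This algorithm is efficient.",
]
tokenized = [[w.strip(".").lower() for w in d.split()] for d in docs]

term = "algorithm"
# (a) Term frequency of "algorithm" in Document 1 (count / document length).
tf = tokenized[0].count(term) / len(tokenized[0])
# (b) Inverse document frequency across the three documents.
df = sum(term in doc for doc in tokenized)
idf = math.log(len(docs) / df)
# (c) TF-IDF score for "algorithm" in Document 1.
print(tf, idf, tf * idf)
```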
9 S → NP VP
NP → DT NN
VP → VB NP
DT → the
NN → cat | dog
VB → chases
Sentence: "the cat chases the dog"
Task: Determine if the sentence is valid (i.e., can be derived from the grammar) using the CKY
algorithm.
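A compact CKY recognizer sketch for this grammar (which is already in Chomsky Normal Form), checking whether S can span the whole sentence:

```python
from itertools import product

# Binary and lexical (terminal) rules of the given grammar.
binary = {("NP", "VP"): "S", ("DT", "NN"): "NP", ("VB", "NP"): "VP"}
lexical = {"the": {"DT"}, "cat": {"NN"}, "dog": {"NN"}, "chases": {"VB"}}

words = "the cat chases the dog".split()
n = len(words)
# table[i][j] holds the non-terminals that derive words[i..j].
table = [[set() for _ in range(n)] for _ in range(n)]

for i, w in enumerate(words):
    table[i][i] = set(lexical[w])

for span in range(2, n + 1):                  # span length
    for i in range(n - span + 1):             # start index
        j = i + span - 1                      # end index
        for k in range(i, j):                 # split point
            for left, right in product(table[i][k], table[k + 1][j]):
                if (left, right) in binary:
                    table[i][j].add(binary[(left, right)])

print("valid" if "S" in table[0][n - 1] else "invalid")
```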
10 Given a trigram model with vocabulary {the, cat, eats, fish} (size 4), and counts:
Trigrams: the cat eats: 5, cat eats fish: 3, others: 0
Bigrams: the cat: 7, cat eats: 4, eats fish: 3, others: 0
Total trigrams: 20
Using add-one smoothing:
- Compute smoothed trigram probabilities P(eats | the cat) and P(fish | cat eats).
- Calculate the sentence probability P(the cat eats fish)
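A small sketch of the add-one (Laplace) smoothed computation, assuming the usual formulation P(w3 | w1 w2) = (C(w1 w2 w3) + 1) / (C(w1 w2) + V) with V = 4, and taking the sentence probability as the product of the two smoothed trigram probabilities (the question leaves open how the first two words are modeled):

```python
V = 4  # vocabulary size: {the, cat, eats, fish}
trigram_counts = {("the", "cat", "eats"): 5, ("cat", "eats", "fish"): 3}
bigram_counts = {("the", "cat"): 7, ("cat", "eats"): 4, ("eats", "fish"): 3}

def p_add_one(w1, w2, w3):
    # Add-one smoothed trigram probability P(w3 | w1 w2).
    return (trigram_counts.get((w1, w2, w3), 0) + 1) / (bigram_counts.get((w1, w2), 0) + V)

p1 = p_add_one("the", "cat", "eats")    # (5 + 1) / (7 + 4)
p2 = p_add_one("cat", "eats", "fish")   # (3 + 1) / (4 + 4)
print(p1, p2, p1 * p2)
```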
11 Scenario: A startup is building a chatbot to handle customer inquiries for an e-commerce
platform. The chatbot needs to understand user queries like "Where is my order?" and
respond appropriately.
Question: Explain how Natural Language Processing (NLP) can enable the chatbot to achieve
this goal. Describe two key NLP tasks involved and how they contribute to understanding and
responding to customer queries.
12 Scenario: A news agency wants to analyze thousands of articles to identify trending topics.
The raw text contains uppercase letters, punctuation, and irrelevant words like "the" and "is."
Question: Design a text preprocessing pipeline for this task. List at least four preprocessing
steps, explain their purpose, and provide an example of how the sentence "The Quick Fox
Jumps!" is transformed after each step.
13 Scenario: A language learning app aims to help students understand sentence structure by
highlighting parts of speech in sentences like "The dog runs quickly."
Question: Explain how POS tagging can be used to achieve this. Provide the expected POS
tags for the given sentence and describe one challenge in accurately tagging ambiguous
words like "runs" (verb vs. noun).
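If a library-based illustration is wanted, NLTK’s off-the-shelf tagger (a sketch; it assumes the tokenizer and tagger resources have been downloaded) produces Penn Treebank tags for the example sentence:

```python
import nltk

# One-time downloads, uncomment if the resources are missing:
# nltk.download("punkt")
# nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("The dog runs quickly.")
print(nltk.pos_tag(tokens))
# Expected output along the lines of:
# [('The', 'DT'), ('dog', 'NN'), ('runs', 'VBZ'), ('quickly', 'RB'), ('.', '.')]
# "runs" is tagged VBZ here, but in a noun context ("the runs") it would need NNS --
# exactly the ambiguity the question asks about.
```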
14 Why is ambiguity a major challenge in NLP? Provide an example of a sentence with multiple
interpretations.
15 A blog analysis tool processes posts with mixed case and punctuation, like "Amazing Trip!!!".
Show the output after applying tokenization, lowercasing, and stop word removal.
16 What is the difference between stemming and lemmatization? Provide an example where
lemmatization is preferred.
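A brief NLTK-based sketch contrasting the two (assuming the WordNet data has been downloaded): stemming clips suffixes heuristically, while lemmatization maps each word to a dictionary form.

```python
from nltk.stem import PorterStemmer, WordNetLemmatizer
# import nltk; nltk.download("wordnet")   # one-time download, if needed

stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()

print(stemmer.stem("studies"), lemmatizer.lemmatize("studies", pos="v"))  # studi vs study
print(stemmer.stem("better"), lemmatizer.lemmatize("better", pos="a"))    # better vs good
```

The second line is a case where lemmatization is clearly preferable: the stemmer leaves “better” unchanged, while the lemmatizer maps it to its base adjective “good”.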
17 A language learning tool tags words in "She runs fast" to teach grammar. Provide the POS tags
and explain how one ambiguous word could be mistagged.
18 What are two limitations of the Bag of Words model in capturing text meaning?
19 Explain how TF-IDF weights words differently from raw frequency counts. Why is the IDF
component important?
20 Compare one-hot encoding to dense embeddings like Word2Vec in terms of memory
efficiency and semantic representation.
21 Explain how Word2Vec creates word embeddings and what makes them capture semantic
relationships.
22 A job portal matches resumes to job postings, e.g., linking "engineer" to "technician". Explain
how Word2Vec helps and suggest whether CBOW or Skip-gram is better for this task.
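A gensim-based sketch (the corpus and hyperparameters here are purely illustrative assumptions) showing how a Word2Vec model could be trained on tokenized resume/job-posting text and queried for terms related to “engineer”; sg=1 selects Skip-gram, sg=0 selects CBOW:

```python
from gensim.models import Word2Vec

# Hypothetical tokenized corpus of resume / job-posting sentences.
sentences = [
    ["senior", "software", "engineer", "with", "python", "experience"],
    ["maintenance", "technician", "for", "industrial", "equipment"],
    ["electrical", "engineer", "and", "field", "technician", "roles"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)  # sg=1: Skip-gram
print(model.wv.most_similar("engineer", topn=3))
```

With a toy corpus of this size the similarity scores are not meaningful; the point is only the API shape and the sg switch between Skip-gram and CBOW.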
23 Explain the difference between syntactic, semantic, and pragmatic analysis in NLP. Provide an
example of how each level of analysis contributes to understanding a sentence like "Can you
open the window?"
24 Describe the mathematical formulation of TF-IDF. Why does the inverse document frequency
(IDF) component amplify the importance of rare terms in a corpus?
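For reference, one standard formulation (variants differ in log base and smoothing) can be written as:

```latex
\mathrm{tf\text{-}idf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t),
\qquad
\mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}
```

where N is the total number of documents and df(t) is the number of documents containing t; because df(t) sits in the denominator inside the logarithm, rare terms receive larger idf values and hence larger weights, which is the amplification the question asks about.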
25 Why does the Bag of Words model fail to capture contextual relationships between words?
Discuss how this limitation impacts tasks like sentiment analysis.
26 Explain why one-hot encoding leads to sparse representations in NLP. How does this sparsity
affect the scalability of models for large vocabularies?
27 Discuss the role of discourse analysis in NLP. How does it differ from syntactic and semantic
analysis in processing multi-sentence texts?
28 Describe how n-grams balance context and computational complexity in language modeling.
Why does increasing n (e.g., from bigrams to trigrams) improve accuracy but exacerbate data
sparsity?