NLP Descriptive Answers Simple

The document discusses various applications of Natural Language Processing (NLP) including text summarization, spam detection, and language models. It explains concepts such as syntactic vs. semantic analysis, n-grams, perplexity, POS tagging, and word embeddings, highlighting their roles and differences. Additionally, it covers techniques like CBOW and Skip-gram in Word2Vec, emphasizing their advantages and limitations.


1. Text Summarization Tool using NLP

a) What is NLP and how does it help? (1.5 marks)

NLP (Natural Language Processing) means teaching computers to understand human language. It helps a program quickly read, understand, and work with large amounts of text, such as news articles.

b) Difference: Syntactic vs. Semantic analysis (1.5 marks)

- Syntactic = Grammar check.

Example: Finding the subject and verb in "She runs fast."

- Semantic = Meaning check.

Example: Understanding whether "Apple" means the fruit or the company, depending on the sentence.

c) How do n-grams help? (2 marks)

N-grams are sequences of n consecutive words; word pairs (bigrams) like "global warming" or "new policy" are common examples. They help find frequent phrases in the text, which can be used in summaries.
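
A minimal sketch of this idea in plain Python (the sample text is made up for illustration; only the standard library is used):

from collections import Counter

text = "global warming is rising and global warming affects new policy"
tokens = text.lower().split()

# Build bigrams as adjacent word pairs and count them
bigrams = list(zip(tokens, tokens[1:]))
counts = Counter(bigrams)

# The most frequent bigrams suggest key phrases for a summary
print(counts.most_common(3))  # e.g. (('global', 'warming'), 2) comes first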

2. Spam Detection using NLP

a) What is NLP and why is it useful for spam detection? (1.5 marks)

NLP helps a system understand email text, so it can check whether a message is spam based on the words and patterns used.

b) Rule-based vs. Machine learning (1.5 marks)

- Rule-based: Uses fixed rules.

Example: Mark an email as spam if it says "You won a prize."

- ML-based: Learns from past spam emails.

Example: A Naive Bayes model trained on many spam and not-spam emails.

c) How do n-grams help detect spam? (2 marks)

Spam emails often repeat word patterns like "free money now". N-gram models learn these patterns to detect spam easily.
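
A hedged sketch of the ML-based approach with n-gram features, assuming scikit-learn is installed (the toy emails and labels are invented for illustration):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny toy dataset (real systems train on thousands of emails)
emails = ["free money now click here",
          "win a free prize now",
          "meeting agenda for tomorrow",
          "please review the project report"]
labels = ["spam", "spam", "ham", "ham"]

# ngram_range=(1, 2) uses both single words and word pairs as features
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(emails, labels)

print(model.predict(["claim your free money now"]))  # likely ['spam']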

3. Perplexity in Language Models

a) What is perplexity? (2 marks)

Perplexity measures how well a model can guess the next word in a sentence: it is the inverse probability of the sentence, normalized by the number of words. Lower perplexity = better model.

b) Calculate perplexity (3 marks)

Given:

P(Dogs) = 0.25

P(bark | Dogs) = 0.15

P(at | bark) = 0.1

P(night | at) = 0.2

P(loudly | night) = 0.1

Multiply all:

0.25 × 0.15 × 0.1 × 0.2 × 0.1 = 0.000075

Perplexity = (1 / 0.000075)^(1/5) ≈ 6.68
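
A quick check of this arithmetic in plain Python (standard library only; the probabilities are the ones given above):

import math

# Conditional probabilities of the five words in the sentence
probs = [0.25, 0.15, 0.1, 0.2, 0.1]

sentence_prob = math.prod(probs)                      # 7.5e-05
perplexity = (1 / sentence_prob) ** (1 / len(probs))  # 5th root of 1/P
print(round(perplexity, 2))                           # 6.68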

4. Bigrams and Smoothing

a) Bigrams starting with AI: (1 mark)

"AI solves", "AI learns"

b) Why are raw bigram counts a problem? (1 mark)

If a pair never appears in the training data, its probability becomes 0, so the model fails on new word pairs.

c) Add-1 smoothing: P(solves | AI) (3 marks)

Use the formula, where V is the vocabulary size:

P(solves | AI) = (count("AI solves") + 1) / (count("AI") + V)

= (1 + 1) / (2 + 7) = 2/9 ≈ 0.22
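
A minimal sketch of add-1 (Laplace) smoothing as a Python function, using the counts from the question:

def add_one_bigram_prob(count_bigram, count_prev, vocab_size):
    # Add 1 to the bigram count and the vocabulary size V to the
    # history count, so unseen pairs never get probability 0
    return (count_bigram + 1) / (count_prev + vocab_size)

# count("AI solves") = 1, count("AI") = 2, V = 7
print(round(add_one_bigram_prob(1, 2, 7), 2))  # 0.22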

5. POS Tagging and NER

a) What is POS tagging? (2 marks)

It labels each word's job in a sentence.

Examples:

- "She eats cake." eats = verb

- "The cake is tasty." cake = noun

b) What is NER? (2 marks)

NER finds names of people, places, etc.

Examples:

- "India" = Location

- "Elon Musk" = Person

c) How does POS tagging help NER? (1 mark)

It shows which words are nouns or proper nouns, helping to find names more accurately.
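
A short sketch of POS tagging and NER together, assuming spaCy and its small English model are installed (pip install spacy, then python -m spacy download en_core_web_sm); the sample sentence is illustrative:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Elon Musk visited India.")

# POS tagging: label each word's job in the sentence
print([(token.text, token.pos_) for token in doc])

# NER: find names of people, places, etc.
# "Elon Musk" is typically tagged PERSON and "India" GPE (a location)
print([(ent.text, ent.label_) for ent in doc.ents])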

6. CBOW (Word2Vec)

a) Goal of CBOW? (1.5 marks)


It learns word meanings by guessing a word using nearby words.

b) How CBOW works? (2 marks)

In "The cat sat on the mat", to guess "sat", CBOW looks at "The", "cat", "on", "the".

c) One advantage and one limitation (1.5 marks)

+ Fast training

− Not great for rare words

7. Skip-gram (Word2Vec)

a) Goal of Skip-gram? (1.5 marks)

It guesses nearby words using the current word.

b) Example: (2 marks)

In "AI solves problems", for the word "solves", Skip-gram tries to guess "AI" and "problems".

c) Advantage and limitation (1.5 marks)

+ Good for rare words

− Slower than CBOW
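
A minimal sketch of training both models (questions 6 and 7) with gensim, assuming it is installed; the two-sentence corpus is a toy, and real embeddings need far more text:

from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["ai", "solves", "problems"]]

# sg=0 trains CBOW (context -> centre word); sg=1 trains Skip-gram
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["sat"][:5])                  # first 5 dimensions of a learned vector
print(skipgram.wv.most_similar("solves"))  # neighbours (meaningless on a toy corpus)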

8. Word Embeddings

a) What are embeddings vs. one-hot? (4 marks)

- One-hot: A sparse vector with a single 1 at the word's position, like [0, 0, 1, 0]. It only marks which word is present and says nothing about meaning.

- Embeddings: Dense vectors of real numbers that capture word meaning, like [0.2, -0.3, 0.7].

Embeddings help find similar words (e.g., king & queen).


b) How Word2Vec learns embeddings? (3 marks)

Trains a small neural network:

- CBOW: uses surrounding words to guess the middle word.

- Skip-gram: uses middle word to guess surrounding words.

c) vec("king") − vec("man") + vec("woman") ≈ vec("queen") (3 marks)

This shows word vectors capture meaning and gender.

Useful in search, translation, and chatbot understanding.
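
A hedged sketch of this analogy with pretrained vectors via gensim's downloader (assumes gensim is installed; "glove-wiki-gigaword-50" is one of its bundled pretrained sets and is downloaded on first run):

import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")

# vec("king") - vec("man") + vec("woman") should land nearest to vec("queen")
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))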
