KOE088: Natural Language Processing
Unit III – Grammars and Parsing
In this unit we study another important NLP topic: grammars and parsing.
A grammar defines the structure of a language, and parsing means working out the
structure of a sentence so that a machine can analyse it correctly.
1. What Is a Grammar?
A grammar is a set of rules that specifies how words can be arranged into
sentences. In NLP, a grammar lets a machine identify the parts of a sentence and
the relations between them.
Example:
Sentence: “The cat sat on the mat.”
The grammar tells us that:
- “The” = article
- “cat” = noun
- “sat” = verb
- “on the mat” = prepositional phrase
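The labelling above can be sketched as a dictionary lookup. This is only a minimal illustration: the `LEXICON` table and the `tag` function are hypothetical names made up for this example, and a real tagger would also need to handle unknown and ambiguous words.

```python
# Hypothetical toy lexicon: word -> word category (assumed for illustration).
LEXICON = {"the": "article", "cat": "noun", "mat": "noun",
           "sat": "verb", "on": "preposition"}

def tag(sentence):
    """Return (word, category) pairs by looking each word up in the toy lexicon."""
    words = sentence.lower().rstrip(".").split()
    return [(word, LEXICON[word]) for word in words]

tags = tag("The cat sat on the mat.")
for word, category in tags:
    print(word, "=", category)
```

Grouping “on the mat” into a prepositional phrase needs phrase-level rules on top of these word-level labels, which is what the parsing methods in the following sections provide.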
2. Sentence Structure and Parse Trees
A parse tree is a diagram that shows the grammatical structure of a sentence.
Example:
Sentence: “She eats an apple.”
Parse Tree:
        S
       / \
     NP   VP
     |   /  \
    She V    NP
        |   /  \
      eats Det  N
            |   |
           an  apple
Here:
- S = Sentence
- NP = Noun Phrase
- VP = Verb Phrase
- V = Verb
- Det = Determiner
- N = Noun
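One common way to hold such a tree in a program is as nested tuples. The sketch below assumes the tree for “She eats an apple.” with the determiner “an” included; the helper names `leaves` and `bracketed` are made up for this example.

```python
# A parse tree as nested tuples: (label, child, child, ...). Leaves are plain strings.
tree = ("S",
        ("NP", "She"),
        ("VP",
         ("V", "eats"),
         ("NP", ("Det", "an"), ("N", "apple"))))

def leaves(node):
    """Collect the words at the bottom of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

def bracketed(node):
    """Render the tree in the bracketed notation often used in NLP texts."""
    if isinstance(node, str):
        return node
    return "(" + node[0] + " " + " ".join(bracketed(c) for c in node[1:]) + ")"

print(leaves(tree))     # ['She', 'eats', 'an', 'apple']
print(bracketed(tree))  # (S (NP She) (VP (V eats) (NP (Det an) (N apple))))
```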
3. Top-Down and Bottom-Up Parsing
a) Top-Down Parsing:
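Parsing starts at the root of the tree (the sentence symbol S) and expands rules
downward until it reaches the leaves (the words).
It is like a teacher who first names the type of sentence and then its parts.
b) Bottom-Up Parsing:
Parsing starts at the leaves (the words) and combines them into larger phrases
until it reaches the sentence symbol S.
It is like a student who first understands the individual words and then the
meaning of the whole sentence.
Example:
Sentence: “The boy plays.”
Top-down: start from S, expand S → NP VP, then NP → Det N and VP → V, and match Det N V against the words
Bottom-up: start from the words, tag them as Det N V, group Det N into NP and V into VP, then combine NP VP into S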
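The two directions can be compared on the example sentence with a small sketch. The grammar, the lexicon and the function names are assumptions made for this illustration. The top-down function is a simple recursive-descent recognizer (it would loop forever on left-recursive rules), and the bottom-up function is a greedy shift-reduce recognizer (it can fail on grammars where a reduction should be delayed).

```python
# Hypothetical toy grammar and lexicon for "The boy plays."
GRAMMAR = {"S": [["NP", "VP"]], "NP": [["Det", "N"]], "VP": [["V"]]}
LEXICON = {"the": "Det", "boy": "N", "plays": "V"}

def to_tags(sentence):
    return [LEXICON[w] for w in sentence.lower().rstrip(".").split()]

def top_down(symbol, tags, pos):
    """Expand `symbol` from position `pos`; return every position where it can end."""
    if symbol not in GRAMMAR:  # word category: must match the next tag
        return {pos + 1} if pos < len(tags) and tags[pos] == symbol else set()
    ends = set()
    for rhs in GRAMMAR[symbol]:
        positions = {pos}
        for child in rhs:  # expand the children left to right
            positions = {e for p in positions for e in top_down(child, tags, p)}
        ends |= positions
    return ends

def bottom_up(tags):
    """Shift one tag at a time, reducing whenever a rule's right side is on top of the stack."""
    stack = []
    for t in tags:
        stack.append(t)  # shift
        reduced = True
        while reduced:
            reduced = False
            for lhs, options in GRAMMAR.items():
                for rhs in options:
                    if stack[-len(rhs):] == rhs:
                        stack[-len(rhs):] = [lhs]  # reduce
                        reduced = True
    return stack == ["S"]

tags = to_tags("The boy plays.")
print(len(tags) in top_down("S", tags, 0))  # True: S covers the whole input
print(bottom_up(tags))                      # True: the stack reduces to S
```

Both functions only recognize the sentence (accept or reject); building the tree would require recording which rule was used at each step.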
4. Transition Network Grammars (TNG)
A transition network is a finite-state machine made up of states and transitions.
Each transition (arc) is labelled with a word category or a phrase type.
Example:
State A —(Det)→ B —(Noun)→ C —(Verb)→ D
Input: “The cat sleeps.”
- “The” = Det → transition A to B
- “cat” = Noun → B to C
- “sleeps” = Verb → C to D
The input is accepted because it ends in the final state D.
This method is useful in dialogue systems and interactive bots.
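The network above can be sketched as a lookup table of transitions. The names `TRANSITIONS`, `FINAL_STATES`, `LEXICON` and `accepts` are assumptions made for this illustration.

```python
# Transition table for A -(Det)-> B -(Noun)-> C -(Verb)-> D.
TRANSITIONS = {("A", "Det"): "B", ("B", "Noun"): "C", ("C", "Verb"): "D"}
FINAL_STATES = {"D"}
LEXICON = {"the": "Det", "cat": "Noun", "sleeps": "Verb"}  # hypothetical toy lexicon

def accepts(sentence):
    """Follow one arc per word; accept only if the walk ends in a final state."""
    state = "A"
    for word in sentence.lower().rstrip(".").split():
        category = LEXICON.get(word)
        state = TRANSITIONS.get((state, category))
        if state is None:  # no arc for this word category from the current state
            return False
    return state in FINAL_STATES

print(accepts("The cat sleeps."))  # True
print(accepts("Cat the sleeps."))  # False
```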
5. Top-Down Chart Parsing
Chart parsing improves efficiency by avoiding repeated work. The chart is a
table that stores intermediate results (the phrases already found for each span
of the input). If a sub-part has already been parsed, it is looked up in the
chart rather than parsed again.
This technique is especially useful for ambiguous sentences.
Example:
Sentence: “I saw the man with the telescope.”
Two readings are possible:
- I used the telescope to see the man.
- The man I saw had the telescope.
Chart parsing stores both interpretations, sharing the sub-parts they have in
common.
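The idea of storing sub-results can be sketched with a memoized top-down recognizer, where the cache plays the role of the chart. This is a simplification and not a full chart parser such as Earley's algorithm: it only counts parses, and it assumes a hypothetical toy grammar with binary rules in which “I” is listed directly as an NP.

```python
from functools import lru_cache

# Hypothetical toy grammar (binary rules only) and lexicon.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {"i": {"NP"}, "saw": {"V"}, "the": {"Det"}, "man": {"N"},
           "with": {"P"}, "telescope": {"N"}}

def count_parses(sentence):
    words = sentence.lower().rstrip(".").split()

    @lru_cache(maxsize=None)  # the cache acts as the chart
    def count(symbol, start, end):
        """Number of ways `symbol` can cover words[start:end]."""
        if end - start == 1:
            return 1 if symbol in LEXICON.get(words[start], set()) else 0
        total = 0
        for left, right in RULES.get(symbol, []):
            for split in range(start + 1, end):
                total += count(left, start, split) * count(right, split, end)
        return total

    parses = count("S", 0, len(words))
    return parses, count.cache_info().hits

parses, chart_hits = count_parses("I saw the man with the telescope.")
print(parses)          # 2: the PP attaches either to the verb phrase or to "the man"
print(chart_hits > 0)  # True: some sub-results were reused from the chart
```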
6. Feature Systems and Augmented Grammars
A simple grammar cannot capture every constraint a sentence must satisfy, such as
agreement between subject and verb. Augmented grammars and feature systems help
handle such structures.
a) Feature System:
Features describe the properties of a word, such as gender, number and tense.
Example:
Word: “boys” → Features: {number: plural, gender: male}
Verb: “play” → Features: {tense: present, number: plural}
Feature matching ensures subject-verb agreement.
b) Augmented Grammar:
Extra features are added to the grammar rules.
Example Rule:
S → NP[Number=?n] VP[Number=?n]
The shared variable ?n forces the noun phrase and the verb phrase to have the
same number.
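Feature matching can be sketched as merging two feature dictionaries and failing on a conflict. The functions `unify` and `sentence_rule`, and the extra entry for “plays”, are assumptions made for this illustration.

```python
def unify(a, b):
    """Merge two feature dicts; return None if a shared feature has conflicting values."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged

boys = {"number": "plural", "gender": "male"}
play = {"tense": "present", "number": "plural"}
plays = {"tense": "present", "number": "singular"}  # assumed entry for contrast

def sentence_rule(np, vp):
    """S -> NP[Number=?n] VP[Number=?n]: both sides must bind ?n to the same value."""
    return unify({"number": np["number"]}, {"number": vp["number"]}) is not None

print(unify(boys, play))           # features merge without conflict
print(sentence_rule(boys, play))   # True
print(sentence_rule(boys, plays))  # False
```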
7. Morphological Analysis
Morphology is the study of the internal structure of words. In NLP, morphological
analysis is used to reduce words to their root form and to identify the
grammatical information carried by their affixes.
Example:
- Played → root = play, tense = past
- Singing → root = sing, form = progressive
Tools such as the Porter stemmer (which strips suffixes by rule) and lemmatizers
(which map a word to its dictionary form) are used in morphological analysis.
Importance:
- Better results in search engines
- Accurate tense and word form in translation
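A rough sketch of suffix-based analysis for the two examples is shown below. The rule table is hypothetical and far cruder than the Porter stemmer or a real lemmatizer; for instance, it would turn “running” into “runn”.

```python
# Hypothetical suffix rules, checked in order: (suffix, features it signals).
SUFFIX_RULES = [
    ("ing", {"form": "progressive"}),
    ("ed", {"tense": "past"}),
]

def analyse(word):
    """Strip the first matching suffix, keeping at least three letters of root."""
    word = word.lower()
    for suffix, features in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return {"root": word[: -len(suffix)], **features}
    return {"root": word}  # no rule applied: treat the word as its own root

print(analyse("Played"))   # {'root': 'play', 'tense': 'past'}
print(analyse("Singing"))  # {'root': 'sing', 'form': 'progressive'}
```

The length check is a design choice for this sketch: it stops short words such as “sing” from losing their ending.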
8. Lexicon
A lexicon is a dictionary that stores words together with their features.
Example entry:
Word: “run”
- POS: verb
- Tense: present
- Form: base / past participle
The lexicon serves as the NLP system's knowledge base about words.
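A lexicon can be sketched as a dictionary keyed by word form. The entries and field names below are illustrative assumptions chosen for this example, not a standard lexicon format.

```python
# Hypothetical lexicon: word form -> features.
LEXICON = {
    "run": {"pos": "verb", "lemma": "run", "form": "base"},
    "ran": {"pos": "verb", "lemma": "run", "form": "past"},
    "running": {"pos": "verb", "lemma": "run", "form": "gerund"},
}

def lookup(word):
    """Return the features of a word form, or None if it is not in the lexicon."""
    return LEXICON.get(word.lower())

def forms_of(lemma):
    """List every stored word form that shares the given root (lemma)."""
    return sorted(w for w, entry in LEXICON.items() if entry["lemma"] == lemma)

print(lookup("Ran"))    # {'pos': 'verb', 'lemma': 'run', 'form': 'past'}
print(forms_of("run"))  # ['ran', 'run', 'running']
```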
9. Parsing with Features
When parsing uses the features stored in the grammar and the lexicon, it is
called feature-based parsing.
Example:
Sentence: “He walks.”
“He” = singular pronoun
“walks” = singular verb
A feature-based parser checks that the subject and the verb agree.
If the sentence were “They walks.”, the parser would detect an error
(they = plural, walks = singular).
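The agreement check can be sketched for two-word sentences of the form subject + verb. The lexicon entries and the `check` function are assumptions for this illustration; a real feature-based parser would apply the same test inside general grammar rules.

```python
# Hypothetical lexicon with a number feature on every entry.
LEXICON = {
    "he": {"pos": "Pron", "number": "singular"},
    "they": {"pos": "Pron", "number": "plural"},
    "walks": {"pos": "V", "number": "singular"},
    "walk": {"pos": "V", "number": "plural"},
}

def check(sentence):
    """Parse 'subject verb' and report whether their number features agree."""
    words = sentence.lower().rstrip(".").split()
    if len(words) != 2:
        return "no parse"
    subj, verb = LEXICON.get(words[0]), LEXICON.get(words[1])
    if subj is None or verb is None or subj["pos"] != "Pron" or verb["pos"] != "V":
        return "no parse"
    if subj["number"] != verb["number"]:
        return (f"agreement error: {words[0]} = {subj['number']}, "
                f"{words[1]} = {verb['number']}")
    return "ok"

print(check("He walks."))    # ok
print(check("They walks."))  # agreement error: they = plural, walks = singular
```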
10. Augmented Transition Networks (ATNs)
An ATN is an extended version of the transition network grammar. In addition to
grammar rules on its arcs, it has registers (memory) and can call sub-networks
recursively.
Example:
Sentence: “Ravi, who is my friend, plays football.”
This sentence is complex because it contains an embedded clause (“who is my
friend”). An ATN can handle it by calling a sub-network for the clause and then
returning to the main network.
In an ATN:
- Nodes = states
- Arcs = transitions (with grammar rules)
- Registers = memory to store information
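The three ingredients can be sketched with plain functions: each function stands for one network, a function call stands for a recursive jump into a sub-network, and a dictionary stands for the registers. The lexicon, the network layout and all function names are assumptions made for this example, which covers only the sample sentence pattern.

```python
# Hypothetical lexicon; the comma is treated as a token of its own.
LEXICON = {"ravi": "PropN", "who": "RelPron", "is": "V", "my": "Det",
           "friend": "N", "plays": "V", "football": "N", ",": "Comma"}

def tokenize(sentence):
    return sentence.lower().rstrip(".").replace(",", " ,").split()

def category(tokens, i):
    return LEXICON.get(tokens[i]) if i < len(tokens) else None

def np_network(tokens, pos, registers):
    """NP: PropN | N | Det N, optionally followed by ', who VP ,'. Returns the end position."""
    if category(tokens, pos) in ("PropN", "N"):
        registers["head"] = tokens[pos]
        pos += 1
    elif category(tokens, pos) == "Det" and category(tokens, pos + 1) == "N":
        registers["head"] = tokens[pos + 1]
        pos += 2
    else:
        return None
    if category(tokens, pos) == "Comma" and category(tokens, pos + 1) == "RelPron":
        clause = {}                                # fresh registers for the embedded clause
        end = vp_network(tokens, pos + 2, clause)  # recursive call into the VP network
        if end is None or category(tokens, end) != "Comma":
            return None
        registers["relative_clause"] = clause
        pos = end + 1
    return pos

def vp_network(tokens, pos, registers):
    """VP: V, optionally followed by an object NP. Returns the end position."""
    if category(tokens, pos) != "V":
        return None
    registers["verb"] = tokens[pos]
    obj = {}
    end = np_network(tokens, pos + 1, obj)
    if end is None:
        return pos + 1  # no object found: treat the verb as intransitive
    registers["object"] = obj
    return end

def s_network(tokens):
    """S: NP VP. Returns the filled registers, or None if the input is rejected."""
    registers, subject = {}, {}
    pos = np_network(tokens, 0, subject)
    if pos is None:
        return None
    registers["subject"] = subject
    pos = vp_network(tokens, pos, registers)
    return registers if pos == len(tokens) else None

result = s_network(tokenize("Ravi, who is my friend, plays football."))
print(result)
```

After a successful parse the registers record the subject, its embedded clause, the main verb and the object, which is the kind of information a plain transition network cannot store.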
11. Practical Applications
a) Grammar Checkers (e.g., Grammarly)
- Parse the sentence to detect grammatical errors.
b) Voice Assistants (e.g., Siri, Alexa)
- Parse the user's utterance to work out the requested action.
c) Machine Translation
- Parse the source sentence to produce a correct translation in the target language.
d) Chatbots
- Parse the user's input to identify the intent and give a suitable reply.
12. Real-Life Examples of Parsing
Example 1:
Input: “I saw her duck.”
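- “duck” can be a noun (the bird): I saw the duck that belongs to her.
- “duck” can be a verb (to bend down): I saw her lower her head.
The parser uses context to decide which reading is intended.
Example 2:
Input: “Time flies like an arrow.”
This sentence can be parsed in several ways (for example, “flies” can be a verb
or a noun). A parser finds these analyses, and techniques such as probabilistic
grammars are used to choose the most likely one.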
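How quickly word-level ambiguity multiplies can be seen by listing every combination of word categories for the second example. The category lists in this lexicon are assumptions made for illustration; a grammar would then rule out most combinations, and a disambiguation step would pick among the rest.

```python
from itertools import product

# Hypothetical ambiguous lexicon: word -> possible categories.
LEXICON = {
    "time": ["N", "V"],
    "flies": ["N", "V"],
    "like": ["P", "V"],
    "an": ["Det"],
    "arrow": ["N"],
}

def tag_sequences(sentence):
    """Enumerate every way of assigning one category to each word."""
    words = sentence.lower().rstrip(".").split()
    options = [LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]

sequences = tag_sequences("Time flies like an arrow.")
print(len(sequences))  # 8 candidate tag sequences (2 * 2 * 2 * 1 * 1)
```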
13. Challenges in Parsing
- Ambiguity: a sentence can have multiple parse trees
- Long sentences: parsing can be slow
- Incorrect grammar: real-world text often contains ungrammatical sentences
- Multiple languages: every language has its own grammar
14. Popular Tools for Parsing
- NLTK (Python): CFG, chart parsing, shift-reduce parsing
- spaCy: Dependency parsing
- Stanford Parser: PCFG-based parsing
- SyntaxNet (Google): Neural network-based parsing
Conclusion:
Grammars and parsing are foundations of NLP. A grammar defines the correct
structure of a sentence, and parsing helps a machine recover and understand that
structure. NLP systems such as grammar checkers, voice assistants and chatbots
are built on these concepts.
In the next unit we will explore deeper grammatical features of natural language
and parsing preferences.