Unit 1: Introduction to Natural Language Processing (NLP)
Natural Language Processing (NLP) is a field of Artificial Intelligence (AI) focused on enabling
computers to understand and process human languages.
Three Waves of NLP
1. First Wave: Rationalism (1950s–1980s): Rule-based systems using symbolic logic. Example: the ELIZA
chatbot.
2. Second Wave: Empiricism (1990s–2000s): Statistical methods trained on large corpora. Example: IBM’s
statistical machine translation models.
3. Third Wave: Deep Learning (2010s–present): Neural networks such as RNNs, CNNs, and Transformers.
Example: ChatGPT, BERT.
Why is NLP Difficult?
Ambiguity – e.g., 'I saw the man with the telescope' (who has the telescope, the speaker or the man?).
Context Dependence – e.g., 'He is cool' (calm, impressive, or cold, depending on context).
Syntax and Grammar Variations.
Named Entity Recognition (NER) – e.g., 'Apple' can be a company or a fruit.
Understanding idioms and sarcasm – e.g., 'Oh great! Another Monday'.
Data Sparsity and Noisy Input.
Importance of NLP
NLP underpins language understanding, information extraction, machine translation, speech recognition,
question answering, and related tasks.
Example: Alexa (speech recognition), Google Translate (machine translation), chatbots.
Basic NLP Terminologies
Phonology: Study of speech sounds. Morphology: Internal structure of words.
Syntax: Sentence structure. Semantics: Literal meaning.
Pragmatics: Intended meaning in context. Discourse: How sentences connect into a coherent text.
Example (pragmatics): 'Can you pass the salt?' – a polite request, not a literal question about ability.
Basic NLP Operations
1. Word Level Analysis – e.g., Tokenization (splitting text into words), Stemming: 'running' → 'run'
2. Syntax Analysis – Parsing the grammatical structure of a sentence.
3. Semantic Analysis – e.g., Word Sense Disambiguation: 'bank' (river bank vs financial institution).
4. Discourse Integration – Linking meaning across sentences, e.g., resolving what a pronoun refers to.
5. Pragmatic Analysis – Understanding intent, sarcasm, and context.
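The word-level step above (tokenization followed by stemming) can be sketched in a few lines of Python. This is only an illustrative toy: the suffix list and the doubled-consonant rule are assumptions made for this example, and real stemmers such as the Porter stemmer apply many more rules.

```python
import re

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("ing", "ed", "es", "s")

def tokenize(text):
    """Split text into lowercase word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def stem(word):
    """Strip the first matching suffix, then undo a doubled final consonant."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words like 'was' are untouched.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            base = word[: -len(suffix)]
            # 'running' -> 'runn' -> 'run'
            if len(base) >= 2 and base[-1] == base[-2] and base[-1] not in "aeiouls":
                base = base[:-1]
            return base
    return word

tokens = tokenize("She was running and jumped over the boxes")
stems = [stem(t) for t in tokens]
print(tokens)
print(stems)
```

Note that a stem need not be a dictionary word; crude suffix stripping trades accuracy for simplicity.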
POS Tagging
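Assigns a part-of-speech tag (noun, verb, adjective, etc.) to each word in a sentence.
Example: 'Ram eats mangoes' → NNP VBZ NNS (Penn Treebank tags: proper noun, 3rd-person singular
present verb, plural noun)
Approaches: Rule-Based, Statistical (e.g., HMM), Neural Networks (e.g., RNN, LSTM)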
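As a rough illustration of the rule-based approach, the sketch below tags words using a tiny lexicon plus a few fallback rules. The lexicon entries and the rules are hypothetical, chosen only so that the example sentence above is handled; a practical tagger would use a far larger lexicon or learn from an annotated corpus.

```python
# Hypothetical mini-lexicon, for illustration only.
LEXICON = {"eats": "VBZ", "the": "DT", "a": "DT", "in": "IN"}

def rule_based_tag(tokens):
    """Tag tokens with Penn Treebank-style tags using a lexicon and fallback rules."""
    tags = []
    for token in tokens:
        lower = token.lower()
        if lower in LEXICON:
            tags.append(LEXICON[lower])   # known word: use its listed tag
        elif token[0].isupper():
            tags.append("NNP")            # capitalized: guess proper noun
        elif lower.endswith("ing"):
            tags.append("VBG")            # '-ing': guess gerund/participle
        elif lower.endswith("s"):
            tags.append("NNS")            # '-s': guess plural noun
        else:
            tags.append("NN")             # default: singular noun
    return tags

tags = rule_based_tag("Ram eats mangoes".split())
print(tags)
```

Rules like these break down quickly on ambiguous words, which is one reason statistical and neural taggers use the surrounding context instead.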
Sequence Labeling
Each word in a sentence is assigned a label, e.g., BIO tags for NER (B = beginning of an entity,
I = inside an entity, O = outside any entity).
Example: 'John lives in Paris' → [B-PER, O, O, B-LOC]
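Labels like those in the example above follow the BIO convention (B- begins an entity, I- continues it, O is outside any entity). A minimal sketch of how such labels could be turned back into entity spans is shown below; the function name bio_to_spans is a hypothetical helper, not part of any library.

```python
def bio_to_spans(tokens, tags):
    """Group BIO tags into (entity_text, entity_type) spans."""
    spans = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continue the currently open entity.
            current_tokens.append(token)
        else:
            # 'O' (or an I- tag with no matching open entity) ends the entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans

spans = bio_to_spans("John lives in Paris".split(), ["B-PER", "O", "O", "B-LOC"])
print(spans)
```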
Natural Language Inception
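Recursive use of NLP: the output of one language model becomes the input of another (or the same)
model, to generate or analyze more language.
Example: GPT-generated text fed to another model; chatbots conversing in a loop.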
Information Retrieval (IR)
Finding the documents in a collection that are most relevant to a query.
Steps: Indexing, Query Processing, Retrieval Models (e.g., BM25), Ranking, Evaluation.
Example: Google search.
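To make the retrieval and ranking steps concrete, here is a small sketch of BM25 scoring over a toy collection. The three documents, the query, and the parameter values k1=1.5 and b=0.75 are assumptions for illustration; the IDF uses the non-negative variant log(1 + (N - df + 0.5) / (df + 0.5)), which is one of several formulations in use. The sketch assumes a non-empty collection of pre-tokenized documents.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy collection (hypothetical), already tokenized by whitespace.
docs = [
    "nlp is a field of artificial intelligence".split(),
    "information retrieval finds relevant documents".split(),
    "search engines rank documents for a query".split(),
]
query = "relevant documents".split()
scores = bm25_scores(query, docs)
# Ranking: document indices sorted by descending score.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The second document ranks first because it matches both query terms, and the rarer term 'relevant' carries a higher IDF weight than 'documents'.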
Applications of NLP
Sentiment Analysis, Text Classification, NER, Machine Translation.
Chatbots, Summarization, Question Answering, Speech Recognition, Information Extraction.
Example: Zomato chatbot, YouTube subtitles, customer feedback analysis.