NLP Unit-I
NLP systems typically analyze language at five levels:
1. Lexical Analysis: Analyzing text at the word level, including part-of-speech
tagging, named entity recognition, and morphological analysis (see the short
example after this list).
2. Syntactic Analysis: Analyzing the grammatical structure of sentences,
including constituency parsing and dependency parsing.
3. Semantic Analysis: Extracting the meaning or semantics of text, including
semantic role labeling, word sense disambiguation, and sentiment analysis.
4. Discourse Analysis: Analyzing the organization and coherence of a larger piece
of text, including coreference resolution and coherence modeling.
5. Pragmatic Analysis: Understanding the implied meaning, context, and
intention behind the text, including speech act recognition and conversational
analysis.
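The first two (word-level and sentence-level) steps can be illustrated with NLTK,
which provides simple functions for tokenization, part-of-speech tagging, and
named entity recognition. The snippet below is only a minimal sketch: the example
sentence is invented, and the nltk.download() resource names may differ slightly
between NLTK versions.

import nltk
# One-time resource downloads (uncomment on first use):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
# nltk.download("maxent_ne_chunker"); nltk.download("words")

text = "Apple hired John in Paris."
tokens = nltk.word_tokenize(text)   # lexical level: split text into word tokens
tagged = nltk.pos_tag(tokens)       # lexical level: assign part-of-speech tags
print(tagged)                       # e.g. [('Apple', 'NNP'), ('hired', 'VBD'), ...]

tree = nltk.ne_chunk(tagged)        # named entity recognition over the tagged tokens
print(tree)                         # labels entities such as PERSON, GPE, ORGANIZATION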
Natural language is ambiguous at several levels:
1. Lexical Ambiguity: This type of ambiguity arises from words that have
multiple meanings or senses. For example, the word "bank" can refer to
a financial institution or the side of a river. Resolving lexical ambiguity
requires considering the context in which the word is used (a small
word-sense-disambiguation sketch appears after this list).
2. Syntactic Ambiguity: Syntactic ambiguity occurs when a sentence can
be parsed or interpreted in multiple ways due to different possible
syntactic structures. For example, consider the sentence "I saw the man
with the telescope." It can be interpreted as "I saw the man who was
holding the telescope" or "I used a telescope to see the man." Resolving
syntactic ambiguity requires understanding the relationships between
words and their syntactic roles.
3. Semantic Ambiguity: Semantic ambiguity arises when a sentence or
phrase has multiple possible interpretations based on the intended
meaning. For example, the sentence "Time flies like an arrow" is usually
read as "time passes quickly, just as an arrow flies," but it also admits the
far-fetched reading "measure the speed of flies the way you would measure
the speed of an arrow." Resolving semantic ambiguity requires considering
the broader context and understanding the intended meaning of the
message.
4. Referential Ambiguity: Referential ambiguity occurs when pronouns or
other reference expressions lack clarity about what they refer to. For
example, in the sentence "John told Bill that he bought a car," the
pronoun "he" could refer to either John or Bill. Resolving referential
ambiguity requires identifying the antecedent or referent based on the
context.
5. Pragmatic Ambiguity: Pragmatic ambiguity arises when the intended
meaning of a statement depends on the speaker's intentions, implied
meaning, or the context of the conversation. This includes phenomena
such as irony, sarcasm, or indirect speech acts, where the literal meaning
may differ from the intended meaning. Resolving pragmatic ambiguity
often requires a deeper understanding of the social and cultural
context.
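To make the lexical-ambiguity case concrete, NLTK includes a simplified version
of the Lesk word-sense-disambiguation algorithm, which picks a WordNet sense of
"bank" based on the surrounding words. This is only an illustrative sketch: it
requires the WordNet data (nltk.download("wordnet")), and the sense Lesk selects
is not guaranteed to be the intuitively correct one.

from nltk.wsd import lesk

sent1 = "I deposited money at the bank yesterday".split()
sent2 = "We sat on the grassy bank of the river".split()

# lesk() returns the WordNet synset whose gloss overlaps most with the context.
print(lesk(sent1, "bank").definition())
print(lesk(sent2, "bank").definition())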
Spelling errors pose a challenge in NLP because they can degrade the
accuracy and effectiveness of many downstream tasks, such as text
classification, information retrieval, sentiment analysis, and machine
translation: a misspelled word may fail to match dictionary or vocabulary
entries, be treated as an out-of-vocabulary token, or be confused with a
different word altogether.
A standard probabilistic approach to spelling correction is the Noisy
Channel Model, which treats the observed (possibly misspelled) text as the
output of a noisy channel applied to the intended text. Let's break the
model down step by step:
1. Intended Message: Let's denote the intended message as M. This is the
message that the sender wants to transmit through the channel.
2. Received Message: Let's denote the received message as R. This is the
message that the receiver actually receives, which may contain errors
or noise due to the transmission process.
3. Prior Probability: P(M) represents the prior probability of the intended
message M. It is the probability of the sender choosing the message M
to transmit.
4. Likelihood Probability: P(R | M) represents the likelihood probability of
receiving the message R given the intended message M. It is the
probability of the received message R, given that the intended message
was M. This accounts for the noise or errors introduced during
transmission.
5. Marginal Probability: P(R) represents the marginal probability of
receiving the message R, irrespective of the intended message. It is
calculated by summing over all possible messages M:
P(R) = Σ over all M of P(R | M) × P(M)
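In practice, a spelling corrector never needs P(R) itself: since P(R) is the same
for every candidate, it simply chooses the intended word M that maximizes
P(R | M) × P(M). Below is a minimal, self-contained Python sketch of this decision
rule; the vocabulary counts and the per-edit error probability are made-up toy
values, not parameters from any real system.

from collections import Counter

# Toy vocabulary with hypothetical frequency counts (the language model P(M)).
WORD_COUNTS = Counter({"the": 500, "they": 120, "then": 80, "them": 60})
TOTAL = sum(WORD_COUNTS.values())

def prior(m):
    # P(M): probability that the writer intended the word m.
    return WORD_COUNTS[m] / TOTAL

def edit_distance(a, b):
    # Standard Levenshtein distance via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(a)][len(b)]

def likelihood(r, m):
    # P(R | M): a crude channel model in which every character edit costs a
    # fixed probability of 0.01 (a real system would estimate this from data).
    return 0.01 ** edit_distance(r, m)

def correct(r):
    # Noisy-channel decision rule: pick M maximizing P(R | M) * P(M).
    return max(WORD_COUNTS, key=lambda m: likelihood(r, m) * prior(m))

print(correct("thw"))  # -> 'the' under these toy counts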
Consider a simple CFG whose rules define the structure of sentences in
terms of a subject, a verb, and an object. Each rule specifies how different
constituents combine to form valid sentence structures. For example, rule 1
states that a sentence can be formed by combining a subject, a verb, and an
object; rule 2 states that a subject can be a noun phrase; and rule 3 defines
a noun phrase as an article followed by a noun. Similarly, rules 4, 5, and 6
define the structure of verbs, verb phrases, and objects.
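Written out explicitly, the six rules described above might look roughly as
follows. This is only a plausible reconstruction: the exact right-hand sides of
rules 4-6, and any lexical rules that supply actual words, are assumptions, since
the original listing is not reproduced here.

1. Sentence   -> Subject Verb Object
2. Subject    -> NounPhrase
3. NounPhrase -> Article Noun
4. Verb       -> 'sees' | 'reads'   (assumed terminals)
5. VerbPhrase -> Verb Object        (assumed)
6. Object     -> NounPhrase         (assumed)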
NLP systems use formal grammars like CFG to parse and analyze the syntactic
structure of sentences. By applying the rules of the grammar, the system can
determine the parts of speech, identify phrases, and establish relationships
between different constituents in a sentence. Such structural analysis is a
foundation for many downstream NLP tasks.
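As a sketch of how such a grammar drives parsing, the toy CFG below (expressed
with NLTK's CFG class, using an invented vocabulary) can be loaded and used to
recover the constituency tree of a short sentence; it is illustrative only, not a
grammar of English.

import nltk

# A toy grammar in the spirit of the rules sketched above; the start symbol
# is the left-hand side of the first production ("Sentence").
grammar = nltk.CFG.fromstring("""
Sentence   -> Subject Verb Object
Subject    -> NounPhrase
NounPhrase -> Article Noun
Object     -> NounPhrase
Verb       -> 'chased'
Article    -> 'the' | 'a'
Noun       -> 'dog' | 'cat'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased a cat".split()):
    print(tree)   # prints the constituency (parse) tree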
It's worth noting that CFG is a relatively simple formalism, and there are more
advanced grammatical frameworks used in NLP, such as Dependency
Grammar, Head-Driven Phrase Structure Grammar (HPSG), and Lexical
Functional Grammar (LFG). These frameworks provide more detailed and
nuanced representations of the grammatical structure of languages,
including English.
Formal grammars, along with other linguistic resources and algorithms, serve
as the foundation for building NLP systems that can understand, generate,
and process natural language effectively.