NLP Unit-I
NLP systems typically analyze language at five levels:
1. Lexical Analysis: Analyzing text at the word level, including part-of-speech
tagging, named entity recognition, and morphological analysis (see the short
example after this list).
2. Syntactic Analysis: Analyzing the grammatical structure of sentences,
including constituency parsing and dependency parsing.
3. Semantic Analysis: Extracting the meaning or semantics of text, including
semantic role labeling, word sense disambiguation, and sentiment analysis.
4. Discourse Analysis: Analyzing the organization and coherence of a larger piece
of text, including coreference resolution and coherence modeling.
5. Pragmatic Analysis: Understanding the implied meaning, context, and
intention behind the text, including speech act recognition and conversational
analysis.
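The first two (word-level and sentence-level) steps can be illustrated with NLTK,
which provides simple functions for tokenization, part-of-speech tagging, and
named entity recognition. The snippet below is only a minimal sketch: the example
sentence is invented, and the nltk.download() resource names may differ slightly
between NLTK versions.

import nltk
# One-time resource downloads (uncomment on first use):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
# nltk.download("maxent_ne_chunker"); nltk.download("words")

text = "Apple hired John in Paris."
tokens = nltk.word_tokenize(text)   # lexical level: split text into word tokens
tagged = nltk.pos_tag(tokens)       # lexical level: assign part-of-speech tags
print(tagged)                       # e.g. [('Apple', 'NNP'), ('hired', 'VBD'), ...]

tree = nltk.ne_chunk(tagged)        # named entity recognition over the tagged tokens
print(tree)                         # labels entities such as PERSON, GPE, ORGANIZATION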
Natural language is ambiguous at several levels:
1. Lexical Ambiguity: This type of ambiguity arises from words that have
multiple meanings or senses. For example, the word "bank" can refer to
a financial institution or the side of a river. Resolving lexical ambiguity
requires considering the context in which the word is used (a small
word-sense-disambiguation sketch appears after this list).
2. Syntactic Ambiguity: Syntactic ambiguity occurs when a sentence can
be parsed or interpreted in multiple ways due to different possible
syntactic structures. For example, consider the sentence "I saw the man
with the telescope." It can be interpreted as "I saw the man who was
holding the telescope" or "I used a telescope to see the man." Resolving
syntactic ambiguity requires understanding the relationships between
words and their syntactic roles.
3. Semantic Ambiguity: Semantic ambiguity arises when a sentence or
phrase has multiple possible interpretations based on the intended
meaning. For example, the sentence "Time flies like an arrow" is usually
read as "time passes quickly, just as an arrow flies," but it also admits the
far-fetched reading "measure the speed of flies the way you would measure
the speed of an arrow." Resolving semantic ambiguity requires considering
the broader context and understanding the intended meaning of the
message.
4. Referential Ambiguity: Referential ambiguity occurs when pronouns or
other reference expressions lack clarity about what they refer to. For
example, in the sentence "John told Bill that he bought a car," the
pronoun "he" could refer to either John or Bill. Resolving referential
ambiguity requires identifying the antecedent or referent based on the
context.
5. Pragmatic Ambiguity: Pragmatic ambiguity arises when the intended
meaning of a statement depends on the speaker's intentions, implied
meaning, or the context of the conversation. This includes phenomena
such as irony, sarcasm, or indirect speech acts, where the literal meaning
may differ from the intended meaning. Resolving pragmatic ambiguity
often requires a deeper understanding of the social and cultural
context.
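To make the lexical-ambiguity case concrete, NLTK includes a simplified version
of the Lesk word-sense-disambiguation algorithm, which picks a WordNet sense of
"bank" based on the surrounding words. This is only an illustrative sketch: it
requires the WordNet data (nltk.download("wordnet")), and the sense Lesk selects
is not guaranteed to be the intuitively correct one.

from nltk.wsd import lesk

sent1 = "I deposited money at the bank yesterday".split()
sent2 = "We sat on the grassy bank of the river".split()

# lesk() returns the WordNet synset whose gloss overlaps most with the context.
print(lesk(sent1, "bank").definition())
print(lesk(sent2, "bank").definition())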
Spelling errors pose a challenge in NLP because they can degrade the
accuracy and effectiveness of many downstream tasks, such as text
classification, information retrieval, sentiment analysis, and machine
translation: a misspelled word may fail to match dictionary or vocabulary
entries, be treated as an out-of-vocabulary token, or be confused with a
different word altogether.
A standard probabilistic approach to spelling correction is the Noisy
Channel Model, which treats the observed (possibly misspelled) text as the
output of a noisy channel applied to the intended text. Let's break the
model down step by step:
1. Intended Message: Let's denote the intended message as M. This is the
message that the sender wants to transmit through the channel.
2. Received Message: Let's denote the received message as R. This is the
message that the receiver actually receives, which may contain errors
or noise due to the transmission process.
3. Prior Probability: P(M) represents the prior probability of the intended
message M. It is the probability of the sender choosing the message M
to transmit.
4. Likelihood Probability: P(R | M) represents the likelihood probability of
receiving the message R given the intended message M. It is the
probability of the received message R, given that the intended message
was M. This accounts for the noise or errors introduced during
transmission.
5. Marginal Probability: P(R) represents the marginal probability of
receiving the message R, irrespective of the intended message. It is
calculated by summing over all possible messages M:
P(R) = Σ over all M of P(R | M) × P(M)
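In practice, a spelling corrector never needs P(R) itself: since P(R) is the same
for every candidate, it simply chooses the intended word M that maximizes
P(R | M) × P(M). Below is a minimal, self-contained Python sketch of this decision
rule; the vocabulary counts and the per-edit error probability are made-up toy
values, not parameters from any real system.

from collections import Counter

# Toy vocabulary with hypothetical frequency counts (the language model P(M)).
WORD_COUNTS = Counter({"the": 500, "they": 120, "then": 80, "them": 60})
TOTAL = sum(WORD_COUNTS.values())

def prior(m):
    # P(M): probability that the writer intended the word m.
    return WORD_COUNTS[m] / TOTAL

def edit_distance(a, b):
    # Standard Levenshtein distance via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(a)][len(b)]

def likelihood(r, m):
    # P(R | M): a crude channel model in which every character edit costs a
    # fixed probability of 0.01 (a real system would estimate this from data).
    return 0.01 ** edit_distance(r, m)

def correct(r):
    # Noisy-channel decision rule: pick M maximizing P(R | M) * P(M).
    return max(WORD_COUNTS, key=lambda m: likelihood(r, m) * prior(m))

print(correct("thw"))  # -> 'the' under these toy counts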
Consider a simple CFG whose rules define the structure of sentences in
terms of a subject, a verb, and an object. Each rule specifies how different
constituents combine to form valid sentence structures. For example, rule 1
states that a sentence can be formed by combining a subject, a verb, and an
object; rule 2 states that a subject can be a noun phrase; and rule 3 defines
a noun phrase as an article followed by a noun. Similarly, rules 4, 5, and 6
define the structure of verbs, verb phrases, and objects.
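Written out explicitly, the six rules described above might look roughly as
follows. This is only a plausible reconstruction: the exact right-hand sides of
rules 4-6, and any lexical rules that supply actual words, are assumptions, since
the original listing is not reproduced here.

1. Sentence   -> Subject Verb Object
2. Subject    -> NounPhrase
3. NounPhrase -> Article Noun
4. Verb       -> 'sees' | 'reads'   (assumed terminals)
5. VerbPhrase -> Verb Object        (assumed)
6. Object     -> NounPhrase         (assumed)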
NLP systems use formal grammars like CFG to parse and analyze the syntactic
structure of sentences. By applying the rules of the grammar, the system can
determine the parts of speech, identify phrases, and establish relationships
between different constituents in a sentence. Such structural analysis is a
foundation for many downstream NLP tasks.
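As a sketch of how such a grammar drives parsing, the toy CFG below (expressed
with NLTK's CFG class, using an invented vocabulary) can be loaded and used to
recover the constituency tree of a short sentence; it is illustrative only, not a
grammar of English.

import nltk

# A toy grammar in the spirit of the rules sketched above; the start symbol
# is the left-hand side of the first production ("Sentence").
grammar = nltk.CFG.fromstring("""
Sentence   -> Subject Verb Object
Subject    -> NounPhrase
NounPhrase -> Article Noun
Object     -> NounPhrase
Verb       -> 'chased'
Article    -> 'the' | 'a'
Noun       -> 'dog' | 'cat'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased a cat".split()):
    print(tree)   # prints the constituency (parse) tree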
It's worth noting that CFG is a relatively simple formalism, and there are more
advanced grammatical frameworks used in NLP, such as Dependency
Grammar, Head-Driven Phrase Structure Grammar (HPSG), and Lexical
Functional Grammar (LFG). These frameworks provide more detailed and
nuanced representations of the grammatical structure of languages,
including English.
Formal grammars, along with other linguistic resources and algorithms, serve
as the foundation for building NLP systems that can understand, generate,
and process natural language effectively.