Addis Ababa Science and Technology
University
Artificial Intelligence And Robotics
Center Of Excellence
Assignment of Natural Language
Processing
Name: Firomsa Abera Keba
Id No: GSR0234/17
May, 2025
Introduction
Natural language processing (NLP) is a multidisciplinary field of study concerned with building
machines that can understand and generate human languages. NLP is increasingly being used in
interactivity and productivity applications, such as creating spoken dialogue systems and
speech-to-speech engines, searching social networks for health or financial information, and
detecting moods and emotions towards products and services [1]. Natural language processing has
evolved significantly over the years, supported by advancements in technology: it started with
early rule-based systems and, through long and sustained effort, has progressed to today's deep
learning models.
NLP Approaches
In this section we classify NLP approaches into four main categories: rule-based methods,
statistical methods, machine learning methods, and deep learning methods.
Rule Based Methods
In the early days, natural language processing relied on handcrafted rules and linguistic
knowledge. Rule-based methods process language using specific rules and lexicons [2]: texts are
analysed with grammatical rules and word lists. Early systems used a simple form of pattern
matching to respond to certain keywords and phrases, and although their capabilities were quite
limited by today's standards, their impact was unmistakable. Rule-based systems have the
following drawbacks:
➢ Manual effort: creating and maintaining a set of rules requires manual effort.
➢ Scalability issues: they do not scale to large datasets or evolving language, so the system
must be updated manually, which is inefficient.
➢ Lack of generalization: they perform well within a limited scope but struggle in broader
domains.
➢ Limited learning capability: they rely entirely on handcrafted rules, and it is impossible to
enumerate every rule.
➢ Rigidity: they are rigid and difficult to customize.
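The keyword-and-pattern style described above can be sketched in a few lines. The rules and responses below are purely illustrative assumptions, not taken from any real system:

```python
import re

# A few hand-written pattern/response rules (purely illustrative).
RULES = [
    (re.compile(r"\bhello\b|\bhi\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bweather\b", re.IGNORECASE), "I cannot check the weather, sorry."),
    (re.compile(r"\bbye\b", re.IGNORECASE), "Goodbye!"),
]

def respond(text):
    """Return the response of the first rule whose pattern matches the input."""
    for pattern, response in RULES:
        if pattern.search(text):
            return response
    return "I do not understand."  # no rule fired

print(respond("Hi there"))        # Hello! How can I help you?
print(respond("Tell me a joke"))  # I do not understand.
```

The fallback line illustrates the brittleness noted above: any input outside the hand-written patterns gets no meaningful response.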
Statistical Methods
To address the impossibility of listing all rules in a rule-based system, statistical methods
emerged. These methods rely on large corpora of text to learn patterns and probabilities
associated with different linguistic phenomena [3]. Statistical models, such as hidden Markov
models (HMMs) and n-gram models, were used for tasks like language modelling, part-of-speech
tagging, and machine translation. To build a statistical NLP system, we first provide a large
corpus from which the model learns the probabilities of words; the model then uses these
probabilities for further computation.
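As a minimal sketch of this idea, a bigram model estimates the probability of a word given the previous word by counting adjacent pairs. The corpus below is a made-up toy example; real systems use corpora of millions of words:

```python
from collections import Counter

# A made-up toy corpus; real systems learn from millions of words.
corpus = "the cat sat on the mat the cat ran".split()

# Count unigram and bigram frequencies (the last token never starts a bigram).
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev)."""
    if unigrams[prev] == 0:
        return 0.0  # unseen history
    return bigrams[(prev, word)] / unigrams[prev]

print(bigram_prob("the", "cat"))  # 2/3: "the" is followed by "cat" twice out of three times
```

This tiny example already shows the limitations listed next: any pair never seen in the corpus gets probability zero, and the model only ever looks one word back.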
Limitations of Statistical methods:
➢ Data dependency: performance depends heavily on the size and quality of the corpus.
➢ They do not capture semantic meaning or contextual nuances well.
➢ They generalize poorly to unseen domains.
➢ They fail to capture long-range dependencies due to fixed window sizes.
➢ As dataset size increases, these methods become computationally expensive because they
involve calculating probabilities for a vast number of word sequences.
Machine learning models
Machine learning approaches carry out natural language tasks by learning representations and
patterns directly from data, rather than relying on handcrafted rules, and are capable of
capturing patterns from a corpus.
Supervised Learning Methods: Supervised learning involves training models using labelled data
sets. In these methods, the correct output label is known for each data sample and the model learns
to predict these labels [2].
Unsupervised Learning Methods: Unsupervised learning aims to discover hidden structures within
the data using unlabeled data sets [2].
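A supervised method can be illustrated with a tiny Naive Bayes sentiment classifier, a classic statistical-ML baseline for text. The training sentences and labels below are invented for the sketch:

```python
import math
from collections import Counter, defaultdict

# Tiny labelled training set (invented for illustration).
train = [
    ("good great film", "pos"),
    ("great acting good plot", "pos"),
    ("bad boring film", "neg"),
    ("terrible bad acting", "neg"),
]

# Count word frequencies per class and build the vocabulary.
word_counts = defaultdict(Counter)
class_counts = Counter()
vocab = set()
for text, label in train:
    words = text.split()
    word_counts[label].update(words)
    class_counts[label] += 1
    vocab.update(words)

def classify(text):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Smoothed log likelihood of each word given the class.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(classify("good film"))        # pos
print(classify("boring terrible"))  # neg
```

Note how the labels drive learning: the model predicts whichever class makes the observed words most probable, which is exactly the supervised setting described above.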
Machine learning has significantly transformed the field of natural language processing,
offering numerous advantages over traditional rule-based approaches: it scales to handle vast
amounts of textual data efficiently, it learns from data, and models can improve over time with
more data and feedback. However, it has some drawbacks, including:
➢ Data dependency: ML models need large datasets to capture patterns.
➢ Lack of context awareness and ambiguity: the meaning of some words differs with context.
➢ Domain specificity: a model trained on one domain remains challenging to adapt to another.
➢ Bias and fairness issues: ML models inherit biases from their training data.
Deep Learning Methods
Deep learning methods are advanced algorithms that perform complex language processing tasks
using artificial neural networks [2]. Neural networks, especially RNNs and LSTM, were widely
used in earlier NLP applications, helping with tasks like machine translation and text generation
[4].
Transformer Models
Transformer models, such as BERT (Bidirectional Encoder Representations from Transformers)
and GPT (Generative Pretrained Transformer), leverage an attention mechanism that enables
models to weigh the importance of different words in a sentence, regardless of their position [4].
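The attention mechanism can be sketched in miniature as scaled dot-product attention over toy vectors. The vectors below are arbitrary; real transformers use learned projections to produce queries, keys, and values:

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each output mixes the values,
    weighted by how similar the query is to each key."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)  # one weight per position, summing to 1
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Three toy two-dimensional token vectors; in self-attention the same
# vectors play the roles of queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)
print(out[0])  # a weighted mixture of all three vectors
```

Because every query attends to every key, the weighting is independent of word position, which is the property highlighted above.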
BERT
BERT has transformed the field by introducing bidirectional context, meaning it can understand
words in relation to all surrounding words in a sentence [4].
GPT
By pre-training on large corpora of text data, GPT can generate coherent and contextually relevant
text, making it suitable for applications like content creation, dialogue generation, and code
synthesis [4].
Deep learning has addressed several drawbacks of traditional machine learning models: it is
capable of capturing syntactic and semantic relationships, it reduces the burden of feature
engineering by learning features from raw text, and pre-trained language models can be
fine-tuned for specific tasks with less labelled data. However, it has drawbacks:
➢ It requires massive amounts of data and high-end computational hardware.
➢ Deep models act as "black boxes," making it difficult to interpret or explain predictions.
➢ Risk of significant bias.
➢ High energy consumption, among others.
Evaluation metrics in NLP
Evaluation metrics are crucial for assessing the performance of models across different NLP tasks.
These evaluation indices assist researchers in picking the most appropriate model for their research
circumstance [5]. The error rate is the proportion of misclassified samples to the total number
of samples. Precision, recall, and F1 scores may be computed from the confusion matrix below:
                     Actual Positive       Actual Negative
Predicted Positive   True Positive (TP)    False Positive (FP)
Predicted Negative   False Negative (FN)   True Negative (TN)
➢ Accuracy: the ratio of correctly estimated samples to the total number of samples. It is
particularly useful on balanced data sets [2]. It is calculated as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
➢ Precision: the ratio of correct positive predictions to total positive predictions. High
precision indicates few false positives [2]. It is calculated as follows:
Precision = TP / (TP + FP)
➢ Recall: the proportion of actual positive instances that are correctly predicted. High recall
indicates catching most of the true positives [2]. It is calculated as follows:
Recall = TP / (TP + FN)
➢ F-measure: balances differing precision/recall preferences, providing a single score that
considers both precision and recall:
F1 = 2 * (Precision * Recall) / (Precision + Recall)
➢ ROUGE and BLEU Scores: ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
and BLEU (Bilingual Evaluation Understudy) are metrics used in tasks such as text
summarization and translation. ROUGE measures the quality of generated summaries, while BLEU
assesses translation quality [2].
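The accuracy, precision, recall, and F-measure formulas above can be checked with a few toy confusion-matrix counts (the numbers are arbitrary examples):

```python
# Toy confusion-matrix counts (arbitrary example numbers).
TP, FP, FN, TN = 40, 10, 5, 45

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1        = 2 * precision * recall / (precision + recall)

print(accuracy)          # 0.85
print(precision)         # 0.8
print(round(recall, 3))  # 0.889
print(round(f1, 3))      # 0.842
```

Note how precision and recall diverge when FP and FN differ, which is why the F1 score is reported as their harmonic mean.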
Limitations and Risks Associated with NLP Systems
There are several limitations and risks that need to be considered while using NLP systems.
Quality data: The effectiveness of the computational models relies on the quality and
comprehensiveness of the data. Although many political discourses are public, including data
sources such as news, press releases, legislation, and campaigns, when it comes to surveying public
opinions, social media might be a biased representation of the whole population [6].
Black box: many modern NLP models are black boxes. We therefore do not encourage decision-making
systems to depend fully on NLP, but suggest that NLP assist human decision-makers [6].
Linguistic and Contextual Challenges: Since human languages are ambiguous, it is challenging
for NLP models to understand the contextual meanings of human language.
The above-mentioned limitations may give rise to several risks:
Misinformation and Fake Content: NLP systems, especially generative models, can be misused
to create fake news, phishing emails, or deepfake text.
Accountability: When NLP systems make incorrect or harmful decisions, it is unclear who is
responsible: the developer, the deployer, or the data provider. There are many more such risks.
Ethical Issues in Natural Language Processing
Privacy Issues
Many natural language processing methods rely on user data to obtain better performance; this
poses potential risks when that data is sensitive, so it must be secured and not shared
publicly. Since natural language processing tasks usually involve collecting and processing
sensitive user information, appropriate privacy protection measures must be taken to protect
users' privacy [7].
Bias Issues
Bias refers to systematic and unfair discrimination embedded in datasets, models, or algorithms
that affects the performance of NLP systems. These biases may arise from multiple sources [8]:
➢ Data bias: due to an imbalance in the training data.
➢ Model bias: due to limitations of the model, such as the chosen algorithm or model
architecture.
➢ Assessment bias: due to the choice of assessment metrics or the limited nature of the
assessment dataset.
➢ Algorithm bias: due to the design or selection of a particular algorithm or process.
➢ Social bias: due to the fact that NLP systems are designed and applied in specific social
contexts and thus may reflect or reinforce real-world inequalities.
In order to mitigate these biases, we can use several approaches, such as data augmentation, model
adaptation, redefinition of assessment metrics, algorithm improvement, and social engagement [7].
Misinformation: Because NLP systems rely on data, and that data may contain biases, they can
later generate misinformation. For example, models trained on social media data, which often
contains hate speech and fake information, may produce biased or misleading output.
Societal Implications of Natural Language Processing
Advances in language processing technology provide many benefits to society, but they have also
introduced certain challenges and negative consequences.
Accessibility: NLP enhances digital accessibility for individuals. For example, it provides
text-to-speech for those with visual impairments and speech-to-text for people with hearing
impairments [9]. For people with physical disabilities, it enables machines to take orders in
natural language. Machine translation helps individuals who do not speak multiple languages by
automatically translating content from one language to another, among many other things [10].
Impact on Communication: The emergence of NLP has changed the way people communicate. NLP
applications such as chatbots, autocomplete, and virtual assistants influence how people write
and speak. For example, autocorrect and predictive text now reshape sentence construction and
vocabulary usage, so writing skill is less of a barrier. But this also has drawbacks: such tools
can reduce the need to develop strong writing and grammar skills, especially among younger
users who rely heavily on automation.
Impact on Culture: NLP systems are heavily dependent on large datasets (corpora), which are
often sourced from specific regions or dominant cultures. As a result, the cultural values and
norms embedded in these datasets can create algorithmic bias that favors certain cultures and
worldviews. This cultural and linguistic bias may contribute to a form of digital colonialism,
where the cultural dominance of a few groups is perpetuated and amplified by AI technologies.
References
[1] A. P. S. Maria A. Kazakova, "Analysis of natural language processing technology: modern
problems," 2022.
[2] A. Arisoy, "Natural Language Processing Algorithms and Performance Comparison," 2024.
[3] O. Masoumzadeh, "From Rule-Based Systems to Transformers: A Journey through the Evolution
of Natural Language Processing," 2023.
[4] A. V. Christopher Sola, "Advanced Natural Language Processing," 2025.
[5] M. S. H. T. A. K. J. Abdul Ahad ABRO, "Natural Language Processing Challenges and Issues," Journal
of Science, 2023.
[6] R. M. Zhijing Jin, "Natural Language Processing for Policymaking," 2023.
[7] Y. Ma, "A Study of Ethical Issues in Natural Language Processing with Artificial
Intelligence," Journal of Computer Science and Technology Studies, 2023.
[8] D. Hovy and S. Prabhumoye, "Five sources of bias in natural language processing," Language
and Linguistics Compass, 2021.
[9] T. V. K. P. k. V. Madhusudhana Reddy, "Speech-to-Text and Text-to-Speech Recognition using Deep
Learning," in Proceedings of the Second International Conference on Edge Computing and
Applications (ICECAA 2023), 2023.
[10] H. W. Z. H. L. H. K. W. C. Haifeng Wang, "Progress in Machine Translation," 2022.