Sentiment Analysis of Twitter Data Using NLP Models: A Comprehensive Review
ABSTRACT Social media platforms, particularly Twitter, have become vital sources for understanding public sentiment due to the rapid, large-scale generation of user opinions. Sentiment analysis of Twitter data has gained significant attention as a method for comprehending public attitudes, emotional responses, and trends, which proves valuable in sectors such as marketing, politics, public health, and customer service. In this paper, we present a systematic review of research conducted on sentiment analysis using natural language processing (NLP) models, with a specific focus on Twitter data. We discuss various approaches and methodologies, including machine learning, deep learning, and hybrid models, along with their advantages, challenges, and performance metrics. The review identifies the key NLP models most commonly employed, in particular transformer-based architectures such as BERT and GPT. Additionally, this study assesses the impact of pre-processing techniques, feature extraction methods, and sentiment lexicons on the effectiveness of sentiment analysis. The findings aim to provide researchers and practitioners with a comprehensive overview of current methodologies, insights into emerging trends, and guidance for future developments in the field of sentiment analysis on Twitter data.
INDEX TERMS Sentiment analysis, natural language processing, machine learning, deep learning,
GPT, BERT.
engagement [12], [13]. Moreover, sentiment analysis enables policymakers to assess public opinion on policy changes and social initiatives, allowing for more informed and responsive governance. In the corporate sphere, understanding consumer sentiment helps organizations refine their products and services to improve overall customer satisfaction and loyalty [14]. These extensive applications have driven researchers and industry practitioners to focus on advancing sentiment analysis, particularly when applied to the unique and dynamic nature of Twitter data. Unlike traditional sources of text, Twitter posts are often informal and filled with nuances such as slang, abbreviations, emojis, and hashtags [15], [16]. These linguistic elements, while enriching communication, present challenges for sentiment analysis due to their variability and context-dependent meanings [17]. Additionally, tweets often include sarcasm, humor, and colloquial expressions that can obscure the true sentiment conveyed, making it difficult for standard models to interpret them accurately [18], [19]. Twitter data poses unique challenges for sentiment analysis due to its brevity, informal language, and diverse linguistic elements. The platform's 280-character limit forces users to convey meaning concisely, often relying on slang, abbreviations, and hashtags. Additionally, emojis, sarcasm, and irony are frequently used, which can obscure the intended sentiment. Tweets often mix languages (code-switching) and contain typographical errors, further complicating analysis. The variability in how sentiments are expressed makes it challenging for models to infer polarity accurately, especially in cases of nuanced or context-dependent sentiment. These characteristics differentiate Twitter data from more structured text, necessitating advanced NLP techniques tailored to these complexities.

To address these complexities, the field has seen significant development and refinement of natural language processing (NLP) techniques specifically tailored to the nuances of social media language [20], [21]. Early sentiment analysis approaches relied on basic machine learning models that, while useful, struggled to capture the deeper context and intricacies of human language. This has led to the integration of more sophisticated NLP methodologies, including deep learning and transformer-based models that leverage large datasets to better understand context and semantic relationships. The evolution of NLP techniques has not only enhanced the accuracy of sentiment analysis but has also paved the way for real-time and large-scale analysis, which is essential given the vast amount of data generated on Twitter daily [22], [23]. Refining these methods promises to reveal more insights from social media and make sentiment analysis an essential tool for decision-makers across various fields [24], [25].

NLP, a subfield of artificial intelligence, involves the interaction between computers and human language, enabling machines to understand, interpret, and generate human language. The task of sentiment analysis requires NLP models to discern the polarity of a text, classifying it as positive, negative, or neutral [9], [11], [26]. However, conducting sentiment analysis on Twitter data is notably complex due to the informal nature of the language used. Tweets often contain slang, abbreviations, misspellings, emojis, hashtags, and context-dependent phrases, all of which add layers of complexity to the analysis. Moreover, Twitter data is rife with phenomena such as sarcasm, irony, and ambiguous expressions that challenge even advanced NLP systems [20], [27]. The evolution of sentiment analysis methodologies reflects the technological advancements in NLP. Traditional machine learning models, such as Naïve Bayes and support vector machines (SVM), initially served as the backbone of sentiment classification tasks [28], [29], [30]. These models rely on handcrafted features and simplified text representations, such as the bag-of-words approach or term frequency-inverse document frequency (TF-IDF), to identify patterns and infer sentiment. While effective in their time, these methods often struggled to capture the contextual relationships between words and were limited in their ability to handle nuanced language.

The development of deep learning marked a significant milestone in the field of NLP. Models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) introduced the ability to learn hierarchical and sequential representations of text data, respectively [31]. CNNs, originally designed for image processing, demonstrated their capability to identify relevant n-grams and features within a sentence, while RNNs, and their more sophisticated variant, long short-term memory (LSTM) networks, proved adept at capturing dependencies across longer sequences of text [32], [33], [34]. These models brought about a marked improvement in the accuracy of sentiment analysis tasks by considering the context of words within a sequence. The introduction of the Transformer architecture, and subsequently models such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pretrained Transformer), has revolutionized NLP by addressing the limitations of earlier models [35], [36]. The Transformer architecture is characterized by its use of self-attention mechanisms, which allow models to process input text in a non-sequential manner and understand bidirectional context. This capability enables Transformer-based models to excel at tasks requiring an understanding of the complex interplay between words, significantly enhancing the performance of sentiment analysis systems.

Despite these advancements, challenges remain. The effectiveness of an NLP model in sentiment analysis depends not only on the sophistication of the model itself but also on the quality of pre-processing techniques, feature extraction methods, and the availability of well-annotated datasets [37]. The pre-processing steps, such as tokenization, normalization, and the removal of stop words, play a crucial role in preparing raw Twitter data for analysis [38]. Additionally, feature extraction techniques, including word embeddings like Word2Vec, GloVe, and contextual embeddings from
BERT, determine how effectively a model can capture the semantic meaning of words and phrases [39]. In this paper, we aim to provide a comprehensive overview of the advancements in sentiment analysis of Twitter data using NLP models. We present various methodologies, compare the performance of different models, and identify the challenges and limitations that persist in this area of research. By analyzing the strengths and weaknesses of different approaches, this review aims to guide future research and provide practitioners with insights into the most effective techniques for Twitter sentiment analysis.

A clear taxonomy of Natural Language Processing (NLP) models for sentiment analysis on Twitter data is essential to provide a structured understanding of their evolution and applicability. The models can be broadly categorized into three groups: traditional machine learning models, deep learning models, and transformer-based architectures. Traditional models such as Naïve Bayes, Support Vector Machines (SVM), and Logistic Regression rely heavily on handcrafted features like Term Frequency-Inverse Document Frequency (TF-IDF) and sentiment lexicons, which perform well for structured data but struggle with the informal and noisy nature of Twitter text. Deep learning models, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), overcome some of these limitations by learning hierarchical and sequential patterns in text, making them more suitable for sentiment analysis on informal social media platforms. However, they often require large labeled datasets and computational resources. Transformer-based models, such as BERT, RoBERTa, and GPT, represent the state of the art by leveraging self-attention mechanisms and pre-trained contextual embeddings to address challenges specific to Twitter, including brevity, mixed sentiments, and multilingual content. While these models demonstrate superior performance, their high computational cost and domain adaptation requirements remain significant barriers. Evaluating these models against Twitter-specific challenges highlights the trade-offs in accuracy, scalability, and adaptability, providing valuable insights into their strengths and limitations for researchers and practitioners.

The primary aim of this paper is to provide a comprehensive review of NLP-based sentiment analysis methods tailored to Twitter data. The study systematically examines traditional and advanced NLP models, evaluates the impact of pre-processing techniques and feature extraction strategies, and identifies challenges specific to Twitter sentiment analysis. Additionally, this review compares the performance of various models using accepted evaluation metrics and explores their applicability in addressing Twitter-specific challenges utilizing NLP methods. We also aim to explore NLP methods for detecting hate speech across different platforms and contexts, together with their datasets and a comparison among them.

Figure 1 illustrates the structured workflow of our systematic review on NLP-based sentiment analysis studies. The process begins with the systematic review process, where a comprehensive search strategy is employed using databases such as IEEE Xplore, SpringerLink, ACM Digital Library, and Google Scholar. This stage involves constructing focused search queries related to ''Sentiment Analysis,'' ''NLP,'' ''BERT,'' and ''Transformer Models,'' followed by initial filtering to assess title and abstract relevance and conducting full-text reviews. The next phase, Inclusion and Exclusion Criteria, sets the standards for selecting studies, where only those involving NLP-based sentiment analysis with ethical considerations are included, and non-NLP studies or those lacking evaluation are excluded. Model and Performance Review is the subsequent stage, where detailed analysis of the chosen studies is performed, including model types, datasets, evaluation metrics, ethical considerations, and identified challenges and solutions. Finally, in the Synthesis of Findings, key results are summarized, gaps in current research are identified, and recommendations for future research directions are provided. This structured approach ensures a comprehensive and systematic evaluation of current literature.

FIGURE 1. Workflow of our method. This figure presents the overarching workflow of our systematic review process. It highlights the key stages, including systematic review, inclusion/exclusion criteria, model and performance review, and synthesis of findings. Each stage is part of the comprehensive approach to analyzing sentiment analysis models for Twitter data.

The key contributions of this paper are as follows:
1) Comprehensive Review of NLP Models: This review covers approaches ranging from traditional machine learning to cutting-edge deep learning and transformer-based architectures.
2) Analysis of Pre-processing and Feature Extraction Techniques: The impact of different pre-processing strategies and feature extraction methods on model performance is evaluated and discussed, providing critical insights into how these techniques enhance the accuracy of sentiment analysis for Twitter data.
3) Performance Comparison and Metrics: The study includes a comparative analysis of model performance using widely accepted evaluation metrics, including the strengths and weaknesses of different approaches under varying conditions.
4) Identification of Challenges and Limitations: This paper presents the challenges specific to sentiment analysis on Twitter, such as handling slang, abbreviations, emojis, and context-dependent sentiment, and discusses how different models attempt to address these issues.
5) Recommendations for Future Research: By summarizing key findings and identifying gaps in existing research, this review provides a roadmap for future advancements in the field of Twitter sentiment analysis using NLP and suggests areas for improvement and exploration.

The rest of the paper is organized as follows. In Section II, the literature review is discussed. Section III demonstrates the methodology; Section IV discusses the challenges and future research directions regarding NLP for sentiment analysis on Twitter data; and Section V concludes the paper.

II. LITERATURE REVIEW
A. BACKGROUND OF NLPs
Natural language processing (NLP) has become essential for analyzing human language and extracting insights from large amounts of text data. Early NLP relied on machine learning models that moved beyond rule-based systems to learn from data. This shift enabled advances in sentiment analysis and allowed for a deeper understanding of opinions and emotions in text. This section reviews the development of NLP from traditional machine learning to deep learning models that have greatly improved the accuracy and effectiveness of sentiment analysis.

The rapid expansion of social media has made the Internet a cost-effective information carrier and has contributed to its current global popularity [40], [41]. Social media platforms like Facebook, YouTube, and Twitter have become extremely popular these days [42]. The field has grown explosively alongside other social media-related content on Twitter, social network sites, blogs, forums, and customer reviews [43]. This data is utilized by many analysts, business owners, and politicians who want to grow their enterprises by taking advantage of the vast amount of text created by users who provide ongoing feedback on the visibility of a particular subject through sentiments, opinions, and reviews [44], [45], [46], [47]. For instance, in the tourism industry, operators can use the analysis of comments and reviews on popular destinations to find ways to draw in new business and enhance the quality of the services provided [48]. Opinion mining and sentiment analysis techniques derived from the use of various social media platforms must begin with the data of individuals in order to analyze different kinds of areas, such as politics, economics, or biology [49]. Emotions can be communicated in various ways through a range of sentiments, passing judgment, vision or insight, or perspectives on individuals [50], [51]. A sentiment can manifest as a person's abrupt conscious or unconscious reaction depending on the circumstance. Furthermore, real-time analysis aids in our examination of the current situation and decision-making for improved outcomes. The application of machine learning and deep learning models has been instrumental in various domains, including medical diagnostics and energy optimization, highlighting the versatility and scalability of these approaches [52], [53], [54]. For example, ensemble-based and hybrid models have shown effectiveness in cardiovascular disease detection, virtual machine migration, and telemonitoring systems, which share similarities with the challenges faced in Twitter sentiment analysis, such as handling noisy data and achieving computational efficiency [55], [56].

Early methods, such as Naïve Bayes and SVM, were employed for tasks like sentiment analysis and text classification [57], [58], [59]. These approaches used engineered features, such as n-grams and part-of-speech tags, to represent the text in a structured way. While effective for basic tasks, these models struggled to capture the nuances of language and deeper semantic relationships, which limits their ability to fully understand complex sentiments in text. The advent of deep learning significantly advanced NLP capabilities by introducing models that could learn more complex representations of text. Convolutional neural networks (CNNs) were adapted from image processing to NLP, enabling models to identify meaningful patterns within phrases and short text segments [60]. Recurrent neural networks (RNNs), and more advanced long short-term memory (LSTM) networks, excelled at handling sequential data and contextual dependencies, allowing them to process and understand longer and more intricate text passages [61], [62]. A groundbreaking advancement in NLP came with the introduction of the Transformer architecture, which revolutionized how language models process text [10], [63], [64]. Unlike previous models that processed input sequentially, Transformers introduced the concept of self-attention to consider the entire context of a sentence simultaneously. This innovation paved the way for models such as bidirectional encoder representations from transformers (BERT) and generative pretrained transformer (GPT), which have set new benchmarks for a wide range of NLP tasks [35], [65], [66], [67]. BERT's bidirectional nature allows it to capture context from both preceding and succeeding words in a sentence, leading to a deeper understanding of language and more accurate interpretation of sentiment.
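To make this concrete, the short Python sketch below applies an off-the-shelf pre-trained Transformer to raw tweets through the Hugging Face pipeline interface. It is an illustration rather than a setup taken from the reviewed studies, and the checkpoint name is an assumption that can be swapped for any Twitter-tuned sentiment model.

# Minimal illustration (not from the reviewed studies): scoring tweets with a
# pre-trained transformer sentiment classifier via the Hugging Face pipeline API.
# The checkpoint name below is an assumption; any Twitter-tuned model with
# positive/negative/neutral labels could be substituted.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)

tweets = [
    "Loving the new update, works great! #happy",
    "ugh, my flight got delayed AGAIN :(",
    "The event starts at 9am tomorrow.",
]

for tweet, result in zip(tweets, classifier(tweets)):
    print(f"{result['label']:>8}  {result['score']:.3f}  {tweet}")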
TABLE 1. Comparison of recent studies on NLP-based sentiment analysis. This table provides a detailed overview of various NLP models, datasets,
pre-processing methods, and key outcomes, including the unique contributions and findings of each study. It offers insights into how these
approaches have advanced sentiment analysis research across different contexts and data sources.
Large language models (LLMs), including GPT-3 and BERT variants, have been developed to handle cross-lingual sentiment analysis and integrate multiple data types. These advancements have pushed the boundaries of sentiment analysis across different languages and contexts.

1) EARLY BEGINNINGS AND RULE-BASED SYSTEMS
The initial approaches to sentiment analysis were rooted in the use of rule-based systems that relied on predefined sets of linguistic rules and sentiment lexicons [81], [82]. For example, terms like ''excellent'' and ''happy'' would be marked as positive, while ''terrible'' and ''sad'' would be marked as negative [83]. Rule-based systems were simple to implement and provided interpretable results, making them suitable for basic sentiment classification tasks [84], [85], [86], [87]. However, these systems were limited by their inflexibility and inability to handle the vast variability and contextual nature of human language. They struggled with phrases where sentiment was more implicit or context-dependent, such as those involving sarcasm, idioms, or complex expressions.

Various learning approaches to sentiment classification involved traditional supervised learning models such as Naïve Bayes, support vector machines (SVM), and logistic regression [88], [89]. These models could learn from labeled training data and generate predictions for new, unseen text. They employed feature engineering techniques, including bag-of-words (BoW) and term frequency-inverse document frequency (TF-IDF), to represent text as numerical vectors that machine learning models could process. This shift allowed for more scalable sentiment analysis and the ability to capture more sophisticated patterns in text. While machine learning algorithms offered improved accuracy over rule-based systems, they were not without their shortcomings. The BoW and TF-IDF representations ignored word order and context, which limited their capacity to understand semantic nuances. Consequently, these models often struggled with distinguishing between sentences that contained the same words but conveyed different meanings due to context. For instance, the phrase ''I am happy'' is positive, whereas ''I am not happy'' is negative; traditional machine learning models could misinterpret such examples without further enhancements.
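As a minimal sketch of the classical pipeline outlined above (light tweet cleaning, TF-IDF features, and a linear classifier), assuming scikit-learn and a handful of hypothetical labelled tweets:

# Classical baseline sketch: light tweet cleaning + TF-IDF features + logistic regression.
# The toy tweets and labels are hypothetical and only illustrate the workflow.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def clean(tweet: str) -> str:
    tweet = re.sub(r"http\S+|@\w+", "", tweet)   # drop URLs and @mentions
    tweet = tweet.replace("#", "")               # keep hashtag words, drop the symbol
    return tweet.lower().strip()

train_tweets = ["I am happy with this phone", "I am not happy with this phone",
                "terrible service, never again", "excellent support, thank you!"]
train_labels = ["positive", "negative", "negative", "positive"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit([clean(t) for t in train_tweets], train_labels)

print(model.predict([clean("not happy at all"), clean("happy with the update")]))

The bigram features let the classifier separate ''happy'' from ''not happy'', which is exactly the kind of context that a plain bag-of-words representation misses.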
The rise of deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), allowed for more sophisticated text representations that captured contextual information and hierarchical structures within the text [90]. CNNs, although originally designed for image recognition, proved effective for NLP tasks by identifying n-grams and important features within sentences. RNNs, particularly long short-term memory (LSTM) networks and gated recurrent units (GRUs), were well-suited for processing sequential data and modeling dependencies over longer text sequences [91]. LSTM networks addressed the vanishing gradient problem seen in standard RNNs, enabling better learning of long-range dependencies and improving the accuracy of sentiment analysis. These advancements enabled models to handle more complex sentences and infer sentiment from context rather than relying solely on isolated words. For example, an LSTM-based model could effectively interpret the sentiment of sentences with shifting tones or those that included negations and qualifiers, such as ''I was expecting great things, but it turned out to be disappointing.'' The ability to capture such details significantly enhanced the performance of sentiment analysis systems and broadened their applications.

4) TRANSFORMATIONAL SHIFT WITH ATTENTION MECHANISMS AND TRANSFORMERS
The next milestone in sentiment analysis came with the introduction of the Transformer architecture and its self-attention mechanism [92]. Unlike RNNs, which process sequences in a linear fashion, Transformers allowed models to process entire sentences or documents in parallel, considering the relationships between all words simultaneously. The attention mechanism enabled the model to weigh the importance of each word in relation to others, capturing long-range dependencies and contextual meanings more effectively. The advent of Transformer-based models, such as BERT and GPT, marked a new era for sentiment analysis and NLP at large. BERT's bidirectional training allowed it to consider context from both directions (left-to-right and right-to-left), resulting in a more nuanced understanding of the text [36], [64], [93]. This capability made BERT particularly effective for tasks that required in-depth comprehension, including sentiment analysis, where subtle shifts in wording and context could alter sentiment interpretation. Similarly, models like GPT leveraged their generative capabilities to fine-tune sentiment analysis in scenarios requiring generative responses, such as chatbots and customer service interactions.

5) ADVANCEMENTS IN PRE-TRAINED MODELS AND TRANSFER LEARNING
Pre-trained language models revolutionized sentiment analysis by reducing the need for extensive labeled data and long training periods [94]. Using transfer learning, models pre-trained on large, diverse corpora could be fine-tuned on smaller, task-specific datasets, achieving high performance with relatively little data. Pre-trained embeddings like Word2Vec and GloVe introduced continuous vector representations that captured semantic relationships between words, laying the groundwork for contextual embeddings produced by models like BERT and GPT [95], [96]. These advancements facilitated more robust sentiment analysis applications capable of handling informal language, slang, and the dynamic nature of social media text, particularly on platforms like Twitter. The ability to fine-tune these models has allowed researchers to create specialized systems for industry-specific sentiment analysis, enhancing insights in areas such as brand monitoring, financial forecasting, and social issue tracking.

6) RECENT METHODS IN SENTIMENT ANALYSIS ON TWITTER DATA AND ASSOCIATED CHALLENGES
In the past few years, the field of sentiment analysis on Twitter has witnessed groundbreaking advancements, particularly with the rise of hybrid, multimodal, and domain-specific approaches that address challenges in informal language, mixed sentiments, and scalability. Recent studies have leveraged hybrid models that combine the strengths of multiple architectures. For instance, BiLSTM-RoBERTa models, as discussed in Jahin et al. [73], achieved state-of-the-art results in crisis-based sentiment classification during COVID-19 by combining RoBERTa's contextual embeddings with BiLSTM's ability to model sequential dependencies. Similarly, transformer-CNN hybrids, as demonstrated by Tan et al. [101], showed superior performance in analyzing multi-class sentiments in multilingual datasets by capturing both local and global features of text.

Another prominent area of recent innovation is multimodal sentiment analysis, which incorporates textual, visual, and auditory cues for more comprehensive sentiment detection. For instance, Areshey and Mathkour [130] introduced a multimodal BERT-based framework that combines text embeddings with image captions extracted from Twitter posts, achieving a significant improvement in sentiment accuracy for social events. This method overcomes the limitations of text-only models in cases where images or memes carry crucial sentiment signals. Recent studies also emphasize domain-specific adaptations of models for targeted applications. For example, Chatzimina et al. [104] utilized fine-tuned GPT-4 Turbo for psychological sentiment analysis on mental health-related tweets, which enabled the model to detect subtle emotional cues such as distress or anxiety. Domain-specific datasets, such as financial tweets, have also been explored using lightweight transformers like DistilBERT, which provide faster processing while maintaining high accuracy, as demonstrated by Liu et al. [33].

Cross-lingual and multilingual sentiment analysis has emerged as another critical area of advancement, given the global and diverse nature of Twitter users. Recent works, such as XLM-R and mBERT, have shown promising results.
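The transfer-learning recipe summarized above, fine-tuning a large pre-trained encoder on a small labelled tweet set, can be sketched in a few lines of Python. The checkpoint, toy data, and hyper-parameters below are illustrative assumptions, not settings reported by any study in this review.

# Transfer-learning sketch (illustrative only): fine-tuning a pre-trained multilingual
# encoder for 3-class tweet sentiment with a plain PyTorch loop.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-multilingual-cased"          # assumed checkpoint (mBERT)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

texts = ["great service!", "worst airline ever", "flight is on time"]   # toy examples
labels = torch.tensor([2, 0, 1])                      # 0=negative, 1=neutral, 2=positive

batch = tokenizer(texts, padding=True, truncation=True, max_length=64, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):                                # a few epochs of fine-tuning
    optimizer.zero_grad()
    outputs = model(**batch, labels=labels)           # cross-entropy loss computed internally
    outputs.loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds.tolist())

In practice, the reviewed studies train on thousands of labelled tweets and tune hyper-parameters per domain; the loop above only illustrates the mechanics.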
The attention mechanism captures the relationships between words, regardless of their position in the sequence, making it particularly effective for handling complex and context-dependent sentiments.

11) CROSS-ENTROPY LOSS FOR MODEL OPTIMIZATION
To train sentiment analysis models effectively, the cross-entropy loss function is commonly used. It measures the difference between the true labels and predicted probabilities, penalizing incorrect predictions. The cross-entropy loss is defined as:

L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} \log \hat{y}_{i,c}    (6)

where:
• N is the number of training samples.
• C is the number of sentiment classes.
• y_{i,c} is the true label for class c (1 if the i-th sample belongs to class c, 0 otherwise).
• \hat{y}_{i,c} is the predicted probability of class c for the i-th sample.

Cross-entropy loss encourages the model to produce accurate and confident predictions, making it a standard choice for classification tasks. The progression from traditional approaches like TF-IDF to advanced methods such as transformer-based models highlights the evolution of sentiment analysis techniques. While traditional methods focus on feature extraction from text, modern approaches leverage contextual embeddings and attention mechanisms to capture nuanced sentiments. These mathematical frameworks form the foundation of sentiment analysis and enable researchers to tackle the challenges posed by informal and dynamic data sources like Twitter.

As a comprehensive review, this paper identifies and synthesizes the key challenges that researchers face in sentiment analysis of Twitter data and presents insights into how recent studies have attempted to address them. One prominent challenge is the informal and highly variable nature of Twitter data, which includes slang, abbreviations, emojis, and hashtags. These linguistic elements are often context-dependent, making it difficult for conventional methods to interpret sentiment accurately. This paper addresses this challenge by highlighting how recent NLP advancements, particularly transformer-based models like BERT, RoBERTa, and GPT, have improved the ability to process informal language through contextual embeddings and pre-trained knowledge on large corpora. Additionally, the paper reviews the impact of preprocessing techniques—such as noise removal, tokenization, and emoji translation—which are critical for preparing raw Twitter data for effective analysis.

Another key challenge is the brevity of tweets, which often limits the amount of context available for sentiment interpretation. Short text data can obscure nuanced sentiments, such as sarcasm, irony, or mixed opinions. Through a systematic review of studies, this paper illustrates how transformer-based models have overcome these limitations by using bidirectional attention mechanisms and self-supervised learning to extract context from minimal text effectively. The paper also addresses the challenge of dataset imbalance, a common issue in Twitter sentiment analysis, where certain sentiment classes, such as negative sentiments, are underrepresented. By reviewing recent works, we discuss how techniques like data augmentation, synthetic data generation, and the use of sentiment lexicons have helped mitigate class imbalance issues.

Furthermore, the paper focuses on the challenges of multilingual and code-switched sentiment analysis, which are increasingly relevant due to the global and diverse nature of Twitter users. Tweets often mix languages or dialects, posing significant difficulties for traditional models. Our review explores how multilingual pre-trained models, such as XLM-R and mBERT, have been employed to handle this complexity, offering improved performance on diverse datasets. We also examine the ethical challenges in sentiment analysis, such as bias in datasets and models, and discuss how researchers have attempted to mitigate these issues through bias-aware algorithms, fairness metrics, and more representative datasets.

Finally, the paper addresses the computational challenges associated with real-time analysis of Twitter data, including the need for efficient models capable of processing vast amounts of text generated every second. By reviewing studies on model optimization, hardware acceleration, and lightweight architectures, this paper provides insights into how researchers are balancing model complexity with scalability. By synthesizing these challenges and the corresponding solutions from the literature, this review serves as a valuable resource for researchers and practitioners, offering a roadmap for addressing persistent and emerging issues in Twitter sentiment analysis using NLP techniques.

III. METHODOLOGY
This section outlines the comprehensive methodology employed to conduct a systematic review of sentiment analysis using NLP models. We aim to provide a clear and thorough roadmap of the research process from inception to synthesis. A well-crafted search strategy was employed, utilizing specific search queries and keywords to capture a wide spectrum of relevant studies across reputable database sources such as IEEE Xplore, SpringerLink, ACM Digital Library, and Google Scholar. Data collection was carried out systematically, guided by rigorous inclusion and exclusion criteria to filter studies based on their relevance, quality, and focus on NLP-based sentiment analysis. This was followed by a multi-phase study selection process, starting from an initial screening of titles and abstracts and proceeding to comprehensive full-text reviews, which ensured that only the most pertinent and high-quality studies were included. The methodology also involved meticulous data extraction procedures that documented key details such as NLP model types, datasets,
pre-processing techniques, evaluation metrics, and performance outcomes. The synthesis of findings integrated these analyses, providing a holistic summary of current research, identifying gaps in the literature, and proposing areas for future exploration. This robust methodology was designed to offer a nuanced understanding of how NLP models are applied to sentiment analysis.

A. RESEARCH PROCESS
The research process for this systematic review was meticulously designed to encompass all stages of literature analysis. It began with defining the scope of the review, which involved identifying key research questions and determining the criteria for relevant studies. The subsequent steps included a multi-phase literature search, data collection, data extraction, analysis, and synthesis of findings. The overall goal was to comprehensively map out the landscape of sentiment analysis in the context of NLP, identify the most effective models, understand current challenges, and highlight areas where further research is needed.

B. RESEARCH DESIGN
The research design adopted for this study is a systematic review, which is recognized for its rigorous approach to synthesizing existing literature on a given topic. Systematic reviews follow a structured and transparent process that ensures all relevant studies are considered and that the synthesis of findings is reproducible and unbiased. This approach was chosen to provide a comprehensive overview of sentiment analysis using NLP models. The research design emphasizes clear criteria for inclusion, detailed data extraction protocols, and thorough analytical techniques to compare and contrast findings across studies.

C. SEARCH STRATEGY
A well-defined search strategy was critical to ensure the retrieval of relevant literature. The strategy was developed to be exhaustive and systematic, focusing on capturing a wide range of articles that discuss sentiment analysis using NLP. The process began by identifying and selecting academic databases known for their extensive coverage of computer science, machine learning, and NLP research. The search strategy included setting search parameters, such as publication date ranges, to prioritize recent studies that reflect current practices and technologies in the field. The search also incorporated various forms of publication, including peer-reviewed journal articles, conference papers, and significant preprints, to ensure that emerging research was not overlooked.

D. SEARCH QUERIES AND KEYWORDS
Constructing effective search queries was essential for capturing a comprehensive set of relevant studies. Search queries were crafted using a combination of keywords and phrases closely aligned with the focus of the research. Terms included ''sentiment analysis,'' ''natural language processing,'' ''NLP models,'' ''deep learning,'' ''transformer models,'' ''BERT,'' ''LSTM,'' and ''Twitter sentiment analysis.'' Boolean operators such as AND, OR, and NOT were used to combine keywords effectively for a targeted search that maximized relevant hits while minimizing irrelevant results. For instance, a search query like (''sentiment analysis'' AND ''NLP'' AND (''deep learning'' OR ''transformer models'')) ensured that studies discussing both traditional and advanced models were captured.
The review is guided by the following research questions:
RQ1: How extensive is the literature on sentiment analysis using advanced NLP models, and which models are most prominently utilized?
RQ2: How do traditional sentiment analysis approaches, such as machine learning, deep learning, and lexicon-based techniques, differ from state-of-the-art NLP models such as GPT, BERT, RoBERTa, XLNet, ALBERT, and ELECTRA?
RQ3: How do the available NLP models for sentiment analysis compare with each other, and how can one determine the most suitable model for a specific application?
RQ4: Which datasets are most commonly utilized for sentiment analysis?
RQ5: What are the key applications and challenges in sentiment analysis using advanced NLP techniques, and how can they be summarized to track new trending research in the field?

E. STAGES OF SENTIMENT CLASSIFICATION
Sentiment classification, a core task in sentiment analysis, involves categorizing text based on the polarity of the opinions it conveys, such as positive, negative, or neutral sentiment. This process can be approached at varying levels of granularity, depending on the specific application and the nature of the text being analyzed. To better capture the nuances of sentiment within diverse datasets, such as Twitter posts, product reviews, or news articles, researchers have developed methods that operate across three distinct stages of sentiment classification. These stages—document-level, sentence-level, and entity-level—each offer unique perspectives and capabilities in understanding and interpreting sentiments within textual data. The choice of stage often depends on the complexity of the data, the desired level of detail, and the analytical goals of the sentiment analysis task. Below, we delve into these stages and highlight their respective roles in sentiment classification.

1) DOCUMENT-LEVEL
This is the first, or basic, stage of opinion mining or sentiment analysis [20]. At this stage, we determine the polarity by taking the entire document into consideration. We are able to categorize whether the opinions and feelings available to us convey a positive or a negative sentiment [21]. For this reason, the document needs to focus on just one subject. For instance, if a text file only includes a single product review, the system will determine whether or not the review as a whole expresses a favorable or unfavorable opinion of the product [22].
2) SENTENCE-LEVEL
Sentiment analysis also includes sentence-level analysis,
which processes and analyzes each sentence to determine
its polarity and provides a positive, negative, or neutral
opinion regarding the sentence [23]. Subjective sentences are
composed of the opinions, perspectives, and points of view of
the users [24]. When a sentence doesn’t suggest an opinion,
it is considered neutral. Sentences that are neutral are more
likely to be classified as objective sentences because they
provide factual information, whereas sentences that express
subjective viewpoints and opinions are classified as subjec-
tive sentences [25]. Machine learning methods are typically used to identify such subjective sentences; however, sentiment analysis at the sentence level has its own limitations.
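For illustration, sentence-level polarity can be scored with a lexicon- and rule-based analyzer; the sketch below uses the VADER implementation from the vaderSentiment package, which is one possible choice rather than a method prescribed by this review.

# Sentence-level polarity sketch using the lexicon/rule-based VADER analyzer.
# Sentences are hypothetical; the compound score in [-1, 1] is thresholded into
# positive / negative / neutral, mirroring the classification described above.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
sentences = [
    "The camera on this phone is amazing!",        # subjective, positive
    "Battery died after two hours, so annoying.",  # subjective, negative
    "The phone was released in March.",            # objective, neutral
]

for sentence in sentences:
    compound = analyzer.polarity_scores(sentence)["compound"]
    label = "positive" if compound >= 0.05 else "negative" if compound <= -0.05 else "neutral"
    print(f"{label:>8}  {compound:+.3f}  {sentence}")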
3) ENTITY-LEVEL
The most thorough kind of sentiment analysis is performed at the entity level, which expresses the output as an opinion about a specific target [25]. The target and its two possible outcomes are regarded as POSITIVE or NEGATIVE. The target opinion helps to realize the significance of this level by providing insight into sentiment regarding entities and their attributes [26]. At this level, reviews, comments, complaints, and so forth are handled. For instance, the majority of sentiment analysis of Twitter data is performed at the entity level, where the tweets are classified as positive or negative [26].
The step-by-step methodology used to conduct this review is given below.

4) ARTICLES COLLECTION
Several protocols were adhered to in order to guarantee an excellent review of the literature on sentiment analysis of Twitter data using NLP models. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were considered [97]. The following search terms were used to retrieve articles from Web of Science, Google Scholar, and IEEE Xplore: sentiment, emotion, opinion, twitter, twitter data, tweets, sentiment analysis, opinion mining, emotion classification, natural language processing, NLP, GPT, BERT, ELECTRA, RoBERTa, and XLNet. A total of 657 publications were found during the search. The literature selection process for this study included only sentiment analysis of Twitter data using NLP models and articles published after 2014.

5) SEARCH STRATEGY
It is crucial to specify inclusion and exclusion criteria precisely because they will be applied in the selection process to assess the overall validity of the literature review [32]. We used the following quality standards, which were inspired by relevant research. Consequently, research focused on sentiment analysis of Twitter data using NLP models was eligible for inclusion. The papers were evaluated based on their titles, abstracts, and full texts, in that order, following the guidelines provided during the selection process.
FIGURE 3. Step-by-step approach for articles filtering. This figure provides a detailed breakdown of the article filtering process, expanding on the ''Systematic Review Process'' stage from Figure 1. It outlines the steps taken to identify, filter, and select articles for review.
Fig. 3 shows the general scheme. When selecting which research articles to include, the following five quality standards were considered: 1. Content from articles published in the preceding ten years was compiled. 2. The studies examined how to use natural language processing techniques and models for sentiment analysis of Twitter data. 3. Research articles that offered a comprehensive description of the architecture, feature extraction, fusion, and data pre-processing were included. 4. Studies examined measurable outcomes such as AUC/ROC, RMSE, accuracy, and F1 score. 5. Only conference and peer-reviewed journal publications were included to ensure legitimacy and quality.
We utilized a unified search query across all databases to ensure consistency and comparability in the search results. The query used for all databases is as follows:
Unified Search Query: (''Sentiment Analysis'' OR ''Opinion Mining'') AND (''Natural Language Processing'' OR ''NLP'') AND (''Twitter'' OR ''Social Media'') AND (''BERT'' OR ''GPT'' OR ''Transformer Models'')
This query was applied to each of the following databases: IEEE Xplore, SpringerLink, ACM Digital Library, and Google Scholar. We chose this unified approach to maintain a standardized methodology, ensuring that the search terms were relevant and inclusive across all platforms. If specific adaptations were required for certain databases (e.g., differences in syntax or Boolean operators), we made minor adjustments to fit their requirements without changing the core terms of the query.
TABLE 2. Summary of initial article counts from each database. This table provides an overview of the search queries applied across multiple databases, initial article counts retrieved, and the unified search strategy used to ensure consistency and reproducibility in the systematic review process.
Table 2 provides a detailed overview of the initial article counts retrieved from four major academic databases—IEEE Xplore, SpringerLink, ACM Digital Library, and Google Scholar—using a unified search query. The query, focused
on terms such as ''Sentiment Analysis,'' ''Opinion Mining,'' ''Natural Language Processing (NLP),'' ''Twitter,'' and advanced models like ''BERT,'' ''GPT,'' and ''Transformer Models,'' was applied consistently across all databases to ensure comparability and reproducibility. The table shows the number of articles retrieved from each database, with a total of 368 initial articles identified. This systematic approach aligns with PRISMA guidelines, ensuring transparency and enabling readers to replicate the search process for the same timeframe and scope. The inclusion of specific article counts from each database highlights the breadth of the literature search and provides a clear starting point for subsequent screening and analysis.
As part of adhering to PRISMA guidelines and ensuring reproducibility, we have added the initial counts of articles retrieved from each database along with the unified search query used. Table 2 summarizes the initial article counts.
Evaluation metrics play a critical role in assessing sentiment analysis models, particularly for noisy and unstructured Twitter data. Common metrics include accuracy, precision, recall, F1-score, Area Under the Curve (AUC), and Root Mean Square Error (RMSE). While accuracy measures the overall correctness of a model, it is often less informative for imbalanced datasets prevalent on Twitter. Precision and recall are useful for understanding the model's ability to identify specific sentiment classes accurately, and their harmonic mean, the F1-score, is particularly relevant for imbalanced or noisy data. Metrics like AUC are valuable for analyzing models' performance across varying classification thresholds, while RMSE is used to evaluate regression-based sentiment scoring. These metrics collectively ensure a robust evaluation of models designed to handle Twitter-specific sentiment challenges.
Table 3 provides a comprehensive summary of the usage of GPT models and other related NLP techniques in sentiment analysis across various datasets. The table shows key studies that have implemented these models and the types of datasets used, such as Twitter data, the Hindu-English TRAC dataset, and mixed social media posts, to perform sentiment classification and emotion analysis. The table describes the pre-processing techniques employed in these studies, which range from tokenization and stemming to the removal of stop words, data cleaning, and dependency parsing. The outcome of these studies is predominantly classification, with outputs being either binary (positive/negative) or multi-class (categorizing data into more detailed sentiment classes). Performance metrics, including accuracy (Acc.), F1-score (F1), and mean squared error (MSE), are used to evaluate model effectiveness. For instance, GPT-4 Turbo, when applied to a mixed dataset, reported an accuracy of 0.653 for psychological text analysis, whereas GPT-3.5, employed on a Twitter dataset, achieved a higher accuracy of 0.96 in multi-class sentiment classification. Additionally, unique approaches such as combining BiLSTM with GPT-2 were utilized for Arabic Twitter data, achieving an accuracy of 0.87. This table underscores the diversity of models and pre-processing strategies in the field, as well as their respective impacts on sentiment analysis performance.
TABLE 3. Summary of the usage of GPT models. This table provides an overview of different GPT models applied in sentiment analysis, datasets used, pre-processing techniques such as tokenization and stemming, and performance metrics like accuracy and F1-scores. It highlights the outcomes of these models across diverse datasets, illustrating their effectiveness and adaptability in various sentiment analysis tasks.
Table 4 provides a detailed overview of the usage of BERT and its variations in sentiment analysis across various datasets, models, pre-processing methods, outcomes, and performance metrics. The table showcases studies that utilize datasets ranging from specific Twitter datasets in Italian and English to more specialized datasets like SemEval and HPV-related tweets. Pre-processing techniques such as data cleaning, tokenization, stemming, noise removal, and label encoding are essential components in preparing the data for analysis, ensuring that models can accurately interpret and classify the input. The outcomes across these studies primarily focus on classification tasks, both binary and multi-class, as well as topic-dependent sentiment analysis. Notably, the results vary, with performance metrics such as F1-scores and accuracy indicating the models' effectiveness. For example, BERT applied to the SemEval 2015 dataset achieved a high accuracy of 0.93 in multi-class classification, while a fine-tuned BERT on HPV-related tweets demonstrated a precise analysis with an RMSE of 0.014. Advanced implementations like LSTM-BERT and SBERT have also been used, integrating techniques such as GloVe embeddings and topic labeling, yielding varied success rates. This table emphasizes the flexibility of BERT and its derivatives in handling different sentiment analysis tasks and datasets, showcasing their performance in terms of accuracy and F1-score, with results reaching up to 0.99 for binary classification on the Sentiment140 dataset.
TABLE 4. Summary of the usage of BERT model. This table provides an overview of studies using BERT models for sentiment analysis, the datasets, pre-processing methods (e.g., tokenization, noise removal), and performance metrics such as accuracy and F1-scores. It highlights the models' effectiveness and outcomes across various datasets, and their adaptability and success in sentiment analysis tasks.
Table 5 summarizes the usage of RoBERTa and its variations in sentiment analysis, detailing the models, datasets, pre-processing techniques, and outcomes. The studies highlighted involve diverse datasets, such as Twitter reviews of
US airlines, Sentiment140, COVID-19 Twitter data, and the Ukraine Conflict dataset, demonstrating the broad applicability of RoBERTa across different contexts and domains. Pre-processing techniques play a crucial role and include common practices like tokenization, stemming, case folding, and noise removal, as well as advanced techniques such as data augmentation, label encoding, and SMOTE for balancing classes. The outcomes of these studies predominantly focus on sentiment classification, both binary (e.g., positive/negative) and multi-class (e.g., multiple categories of sentiment or emotion classification). The models' performance varies, with accuracy (Acc.) being a primary evaluation metric. For example, RoBERTa combined with CNN and LSTM for analyzing Twitter airline reviews achieved a high accuracy of 0.94, reflecting its strong performance in binary classification tasks. Similarly, a BiLSTM-RoBERTa approach applied to the COVID-19 Twitter dataset demonstrated a comparable multi-class accuracy of 0.94. Other configurations, such as RoBERTa-GRU and RoBERTa-RNN, showed slightly lower accuracy, emphasizing the impact of model architecture on performance. The table underlines the effectiveness of RoBERTa-based models in handling complex pre-processing and classification tasks across a range of datasets, which makes them a robust option for sentiment analysis applications.
TABLE 5. Summary of the usage of RoBERTa model. This table provides an overview of studies utilizing RoBERTa models for sentiment analysis, details on datasets, pre-processing techniques (e.g., tokenization, stemming), and key performance results such as accuracy and F1-scores. It highlights the effectiveness and outcomes of RoBERTa models across diverse datasets.
Table 6 presents an extensive overview of NLP models used in sentiment analysis, emphasizing the range of datasets, pre-processing techniques, outcomes, and performance results reported in various studies. The models
featured include popular transformer-based architectures include accuracy (Acc.) and F1-score (F1), indicating the
such as BERT, RoBERTa, GPT-2, GPT-3, and DistilBERT, models’ effectiveness. For example, the use of BERT with
as well as recurrent and hybrid models like LSTM, BiLSTM, tokenization, noise removal, and lemmatization on Twitter
and ensemble methods combining CNN and BERT. Datasets event data achieved an accuracy of 93%, while a hybrid
span various domains, including Twitter (e.g., Sentiment140, RoBERTa-LSTM applied to YouTube comments resulted
crisis data, healthcare data), social media platforms (e.g., in a 91% accuracy for multi-class classification. The table
Reddit, Instagram, YouTube, Facebook), and specialized underscores the critical role that pre-processing techniques
collections like IMDB reviews and financial news. Pre- play in enhancing the performance of sentiment analysis
processing techniques applied across these studies include models. Comprehensive pre-processing, as demonstrated in
fundamental steps such as tokenization, stemming, stop word studies involving tokenization combined with noise reduction
removal, and more complex methods like noise reduction, and case folding, leads to notable performance improvements
data augmentation, and case folding, underscoring the impor- across different datasets and models. This reinforces the
tance of pre-processing in enhancing model performance. notion that tailored pre-processing pipelines are essential for
The outcomes focus on classification tasks, with both binary (positive/negative) and multi-class (multiple sentiment categories) outputs. Results vary, demonstrating the models' capabilities through accuracy, F1-score, and precision metrics. For instance, BERT combined with LSTM on news headlines achieved an impressive accuracy of 95%, while a Transformer-XL model applied to YouTube comments attained an accuracy of 93%. GPT-2 and GPT-3 showed robust performance in multi-class classification, with F1-scores up to 0.89. These findings highlight the diversity and effectiveness of NLP models in handling sentiment analysis tasks across various data sources, showcasing the importance of tailored pre-processing methods and model selection to optimize performance for specific applications.

Table 7 provides an in-depth summary of pre-processing techniques used in sentiment analysis studies, showcasing the diversity of methods and their impact on model performance. This table includes a variety of models such as BERT, LSTM, RoBERTa, and hybrid approaches (e.g., RoBERTa-LSTM), and their application across multiple datasets including Twitter event data, IMDB movie reviews, and product reviews. Pre-processing techniques outlined in the table range from basic steps like tokenization, stop word removal, and stemming to more advanced processes such as data augmentation, noise removal, dependency parsing, and lemmatization. The outcome of these studies primarily focuses on classification tasks, yielding both binary and multi-class outputs, and the performance metrics reported underscore the importance of optimizing NLP models for specific sentiment analysis tasks.

Table 8 presents a detailed performance comparison of various NLP models in sentiment analysis, covering the datasets, evaluation metrics, and results reported in recent studies. The table covers a range of models: traditional LSTMs and BiLSTMs, transformer-based architectures like BERT, RoBERTa, and GPT variations, as well as hybrid models such as RoBERTa-CNN and Transformer-CNN. These models were applied to diverse datasets including Twitter Sentiment140, IMDB movie reviews, Facebook reviews, YouTube comments, and Reddit posts, reflecting the flexibility of NLP techniques across different data sources. The evaluation metrics primarily used are accuracy, F1-score, precision, and recall, each providing insights into model efficacy in binary or multi-class sentiment classification. For instance, BERT achieved a high accuracy of 92% on Twitter Sentiment140, outperforming traditional models, while RoBERTa applied to YouTube comments maintained an accuracy of 91%, demonstrating its reliability for real-time analysis. Models like LSTM were noted for their effectiveness in handling long-form text (e.g., IMDB reviews with an F1-score of 0.87), whereas transformer-based approaches showed balanced performance, with GPT-3 achieving an F1-score of 0.91 on Amazon reviews, indicating strong context-aware capabilities. Hybrid approaches such as RoBERTa-CNN and Transformer-CNN displayed robust precision and recall, making them suitable for more nuanced multi-domain sentiment tasks. This table underscores the significant strides in NLP model performance and adaptability, and the varying strengths of these models depending on the nature of the dataset and the sentiment analysis task.

TABLE 6. Overview of NLP models applied in sentiment analysis. This table summarizes various NLP models, including transformers and traditional approaches, detailing the datasets used, pre-processing methods (e.g., tokenization, data cleaning), outcomes, and key findings.
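As a concrete illustration of how the metrics cited above (accuracy, precision, recall, and F1-score) are typically computed, the following minimal Python sketch uses scikit-learn; the gold labels and predictions are illustrative placeholders rather than data from any reviewed study.

```python
# Minimal sketch: computing the metrics reported in Tables 6-8
# (accuracy, precision, recall, macro-F1) for a three-class sentiment task.
# The label lists below are illustrative placeholders.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

labels = ["negative", "neutral", "positive"]

# Hypothetical gold labels and model predictions for a handful of tweets.
y_true = ["positive", "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "negative", "positive", "positive", "neutral"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, average="macro", zero_division=0
)

print(f"accuracy={accuracy:.2f}  precision={precision:.2f}  "
      f"recall={recall:.2f}  macro-F1={f1:.2f}")
```

Macro averaging is used here because, as noted throughout the reviewed studies, class imbalance is common in Twitter data and per-class weighting choices materially affect reported scores.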
IV. DISCUSSION
This section synthesizes the findings of the review and their implications for the field of sentiment analysis using NLP models. We aim to provide a deeper understanding of the effectiveness, limitations, and future potential of various models, and of how these insights contribute to advancing research and practical applications.

TABLE 7. Pre-processing techniques and their impact on sentiment analysis performance. This table outlines the various pre-processing techniques applied in sentiment analysis studies, detailing the datasets used, models employed, and the resulting performance metrics such as accuracy and F1-score.

A. COMPARATIVE ANALYSIS OF MODEL PERFORMANCE
This subsection examines the differences in performance among various NLP models and their respective strengths and weaknesses in sentiment analysis tasks. Transformer-based models, such as BERT, RoBERTa, and GPT variations, consistently exhibit superior performance due to their deep contextual understanding and bidirectional training capabilities. BERT, known for its robust ability to understand word context from both directions, performs exceptionally well on tasks involving complex sentence structures. RoBERTa, an optimized variant of BERT, often outperforms its predecessor, particularly in multi-class sentiment classification and domain-specific tasks, due to its enhanced training processes and larger training datasets. On the other hand, GPT models, especially GPT-3 and GPT-3.5, show strong results in generating human-like text and handling nuanced context, although they sometimes require more data and fine-tuning for optimal sentiment analysis performance. Traditional deep learning models like LSTM and BiLSTM remain effective for sequential data but often fall short of transformer-based architectures in capturing deeper contextual relationships. The discussion highlights how choosing the right model depends on the specific characteristics of the dataset and the sentiment analysis objectives.

TABLE 8. Performance comparison of various NLP models in sentiment analysis. This table presents a detailed comparison of NLP models, including the datasets used, evaluation metrics (e.g., accuracy, F1-score), and key performance results reported in recent studies. It highlights the strengths and effectiveness of different models across various sentiment analysis tasks.
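To illustrate the kind of head-to-head comparison summarized in Table 8, the short sketch below runs two publicly available sentiment classifiers over the same example tweets using the Hugging Face pipeline API. The checkpoint names are assumptions about commonly available models (a DistilBERT checkpoint fine-tuned on SST-2 and a Twitter-tuned RoBERTa), not the exact systems evaluated in the reviewed studies.

```python
# Minimal sketch of a head-to-head comparison in the spirit of Table 8.
# The checkpoint names are assumptions about available Hugging Face models.
from transformers import pipeline

checkpoints = {
    "DistilBERT (SST-2)": "distilbert-base-uncased-finetuned-sst-2-english",
    "Twitter RoBERTa": "cardiffnlp/twitter-roberta-base-sentiment-latest",
}

tweets = [
    "Loving the new update, great job! #HappyDay",
    "Ugh, my flight got delayed again...",
]

for name, ckpt in checkpoints.items():
    classifier = pipeline("sentiment-analysis", model=ckpt)
    for tweet, result in zip(tweets, classifier(tweets)):
        # Each result is a dict with a predicted label and a confidence score.
        print(f"{name}: {result['label']} ({result['score']:.2f}) <- {tweet}")
```

A comparison of this kind only becomes meaningful when both models are scored on the same labelled test set with the metrics shown earlier; the sketch is intended to show the mechanics rather than a benchmark.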
B. THE IMPACT OF PRE-PROCESSING TECHNIQUES
Pre-processing is a vital step in preparing text data for analysis, and this subsection focuses on how various techniques contribute to overall model effectiveness. Studies have shown that basic pre-processing methods such as tokenization, stop word removal, and noise reduction help improve data quality and model interpretability. More advanced techniques like stemming, lemmatization, and dependency parsing enhance data uniformity and reduce dimensionality, leading to better model performance. Additionally, specialized methods such as data augmentation and feature vector formation can be critical for handling imbalanced datasets, thereby boosting the robustness and generalization of models. This subsection also discusses how the choice of pre-processing techniques is often tailored to the dataset and model; for instance, noisy social media data requires comprehensive cleaning to manage informal language, emojis, and abbreviations effectively. The relationship between pre-processing rigor and improved outcomes is also emphasized: while thorough pre-processing can lead to higher accuracy and F1-scores, it can also increase computational cost and complexity.

Preprocessing is an essential step in preparing Twitter data for effective sentiment analysis due to its noisy and unstructured nature. Tokenization, which breaks text into smaller units, is critical for managing hashtags, mentions, and punctuation effectively. Handling emojis, often significant carriers of sentiment, involves converting them into textual equivalents or sentiment scores. Hashtags, which encapsulate key sentiments or topics, require splitting into component words for accurate analysis (e.g., ''HappyDay'' becomes ''Happy Day''). Noise removal, including eliminating retweets, URLs, and irrelevant symbols, improves data quality and reduces distractions for models. Normalization processes, such as standardizing abbreviations, lowercasing text, and handling elongated words (e.g., ''cooool'' becomes ''cool''), ensure uniformity in the dataset. These preprocessing techniques enhance the interpretability and performance of NLP models applied to Twitter sentiment analysis.

C. IMPLEMENTATION PRACTICES, HYPER-PARAMETERS, AND DATASETS IN REVIEWED STUDIES
As this paper is a comprehensive review, we have synthesized the implementation processes, hyper-parameters, and datasets commonly utilized in sentiment analysis studies to provide actionable insights and context for researchers. These aspects are integral to understanding the performance and applicability of various natural language processing (NLP) methods for sentiment analysis tasks, particularly on Twitter data.

D. DATASETS USED IN REVIEWED STUDIES
A variety of datasets have been employed in the studies we reviewed, each catering to different sentiment analysis objectives and language-specific tasks. Among the most popular datasets is Sentiment140, a benchmark dataset that contains 1.6 million labeled tweets (positive and negative), making it a cornerstone for Twitter-specific sentiment analysis. The dataset is frequently used to evaluate traditional machine learning methods as well as modern deep learning architectures. Another widely used dataset is the SemEval series, which includes domain-specific tasks and multilingual sentiment datasets, providing a robust platform for evaluating the adaptability of models across languages and contexts. For general sentiment analysis, datasets like the IMDB Movie Reviews (50,000 labeled reviews) and Amazon Product Reviews (millions of customer reviews with sentiment labels) are employed to benchmark models on longer-form text. Other datasets focus on specific domains, such as COVID-19 Twitter datasets that analyze public sentiment during crises, and the Reddit Comment Corpus, which enables sentiment analysis of informal, user-generated long-form text. Each of these datasets presents unique challenges, including class imbalance, informal language, and mixed sentiments, requiring advanced techniques to achieve accurate classification.

E. PREPROCESSING TECHNIQUES
Preprocessing plays a critical role in preparing textual data, especially for Twitter sentiment analysis, where the language is often noisy and informal. Most studies we reviewed employ a series of preprocessing steps, including:
• Noise Removal: Elimination of URLs, mentions (e.g., @username), and retweets to reduce irrelevant features.
• Tokenization: Splitting text into smaller units, such as words or subwords, using methods like WordPiece or Byte Pair Encoding (BPE).
• Stopword Removal: Filtering out common words (e.g., and, the) that do not carry significant sentiment information.
• Emoji and Hashtag Handling: Converting emojis into textual equivalents (e.g., a smiling-face emoji → happy) and splitting hashtags into component words (e.g., #HappyDay → Happy Day).
• Normalization: Lowercasing text, standardizing abbreviations, and handling elongated words (e.g., cooool → cool).
• Data Balancing: Techniques such as oversampling, undersampling, or synthetic data generation are used to address class imbalance issues.
These preprocessing steps significantly improve model performance by reducing noise and standardizing input data. In some cases, advanced techniques such as back-translation and synonym replacement are employed for data augmentation, enhancing model robustness.
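A minimal sketch of the basic cleaning steps listed above (noise removal, hashtag splitting, elongation handling, and lowercasing) is shown below; the regular expressions and the clean_tweet() helper are illustrative choices rather than a pipeline prescribed by the reviewed studies.

```python
# Minimal sketch of rule-based tweet cleaning: noise removal, hashtag
# splitting, elongation normalization, and lowercasing. The patterns and
# the clean_tweet() helper are illustrative assumptions.
import re

def clean_tweet(text: str) -> str:
    text = re.sub(r"http\S+|www\.\S+", " ", text)   # remove URLs
    text = re.sub(r"@\w+", " ", text)               # remove @mentions
    text = re.sub(r"\bRT\b", " ", text)             # drop retweet markers
    # Split CamelCase hashtags: "#HappyDay" -> "Happy Day"
    text = re.sub(r"#(\w+)",
                  lambda m: re.sub(r"(?<=[a-z])(?=[A-Z])", " ", m.group(1)),
                  text)
    # Normalize elongated words: runs of 3+ identical characters -> 2
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    text = text.lower()                             # lowercase
    return re.sub(r"\s+", " ", text).strip()        # collapse whitespace

print(clean_tweet("RT @user Cooool!!! Loving it http://t.co/xyz #HappyDay"))
# -> "cool!! loving it happy day"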
F. IMPLEMENTATION AND HYPER-PARAMETERS IN REVIEWED STUDIES
The reviewed studies highlight a range of implementation practices and hyper-parameter configurations that are critical for training effective sentiment analysis models. For state-of-the-art transformer-based architectures, such as BERT and RoBERTa, fine-tuning is typically performed with the following hyper-parameters:
• Learning Rate: Most studies use a small learning rate in the range of 2 × 10⁻⁵ to 5 × 10⁻⁵, which prevents overfitting during fine-tuning on domain-specific datasets.
• Batch Size: Common batch sizes include 16 or 32, balancing memory constraints and training efficiency.
• Epochs: Fine-tuning is often conducted over 3 to 5 epochs, as longer training can lead to overfitting on small datasets.
• Optimizer: The Adam optimizer, particularly its variant AdamW, is frequently used due to its adaptive learning rate and regularization capabilities.
• Max Sequence Length: Input sequences are typically truncated or padded to 128 or 256 tokens to fit within memory constraints while preserving sufficient context for sentiment classification.
For models like GPT and GPT-3.5, hyper-parameters include larger context windows (e.g., 512 or 1024 tokens) and specialized learning rate schedules. Studies leveraging ensemble methods, such as combining BERT with LSTM or CNN, often focus on optimizing hyper-parameters for both components to maximize complementary strengths.
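The following sketch shows one way these commonly reported settings (a BERT-style encoder, AdamW, a 2 × 10⁻⁵ learning rate, batch size 16, three epochs, and 128-token inputs) could be assembled with the Hugging Face Trainer API; the toy in-memory dataset and label scheme are illustrative assumptions, not data from the reviewed studies.

```python
# Minimal fine-tuning sketch reflecting the hyper-parameter ranges above.
# The tiny dataset, label scheme, and output path are illustrative.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)

# Toy labelled tweets: 0 = negative, 1 = neutral, 2 = positive.
raw = Dataset.from_dict({
    "text": ["worst customer service ever",
             "the event starts at 9 am",
             "absolutely loving this phone"],
    "label": [0, 1, 2],
})

def tokenize(batch):
    # Truncate/pad to 128 tokens, as commonly reported in the studies.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-tweet-sentiment",   # hypothetical output directory
    learning_rate=2e-5,                  # typical range: 2e-5 to 5e-5
    per_device_train_batch_size=16,
    num_train_epochs=3,                  # 3-5 epochs to limit overfitting
    weight_decay=0.01,                   # Trainer defaults to AdamW
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```

In practice, the reviewed studies pair such a setup with a held-out validation split and the evaluation metrics described in the next subsection.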
G. EVALUATION METRICS
Evaluation is a key aspect of sentiment analysis studies, and the reviewed literature commonly reports the following metrics:
• Accuracy: A widely used metric for balanced datasets, indicating the proportion of correctly classified instances.
• F1-Score: Particularly important for imbalanced datasets, combining precision and recall into a single metric.
• Cross-Entropy Loss: Used during training to quantify the difference between predicted and true class probabilities.
• Area Under the Curve (AUC): Evaluates the model's ability to distinguish between classes across different thresholds.

H. INSIGHTS AND IMPLICATIONS
The reviewed studies demonstrate that achieving high performance in sentiment analysis requires careful attention to dataset selection, preprocessing, and hyper-parameter tuning. Models like BERT, RoBERTa, and GPT-3.5 consistently outperform traditional methods, especially when fine-tuned on domain-specific or multilingual datasets. However, challenges such as real-time analysis, handling noisy data, and computational efficiency remain active areas of research. This synthesis of implementation practices aims to guide future studies and provide researchers with actionable insights for designing effective sentiment analysis pipelines.

I. CHALLENGES
Despite advances in NLP model capabilities, several challenges persist in sentiment analysis. One of the main issues discussed is the difficulty of detecting nuanced expressions such as sarcasm, irony, and mixed sentiments, which often require a level of language understanding that current models struggle to achieve. Although transformer-based models have advanced context understanding, they are not infallible when external knowledge or cultural context is necessary for accurate interpretation. Another challenge is the limitation in cross-domain performance: models trained on a specific dataset often perform suboptimally when applied to different domains, which limits their generalizability and scalability. This section also addresses computational and resource constraints, particularly for large models like GPT-3, which require significant processing power and data to fine-tune effectively. Finally, the discussion highlights ethical concerns, including biases in training data that can influence model output and reinforce harmful stereotypes.

J. MODEL GENERALIZATION AND ADAPTABILITY
Generalization across different datasets and domains is a key measure of the robustness of an NLP model. This subsection reviews how models like BERT and RoBERTa have demonstrated adaptability across various tasks but may require domain-specific fine-tuning to maintain performance when applied to new data types. It explores strategies for enhancing cross-domain generalization, such as transfer learning, domain adaptation, and ensemble models that combine the strengths of multiple architectures. Additionally, it discusses the importance of creating more diverse and representative training datasets to improve model adaptability. While models like SBERT and BiLSTM-RNN hybrids have shown promise in balancing generalization with performance, further research is needed to develop models that can consistently perform well across different sentiment analysis scenarios.

K. ETHICAL CONSIDERATIONS AND BIAS IN MODEL TRAINING
Ethical issues related to the training of NLP models are crucial, as biases in training data can lead to biased outcomes that reinforce social stereotypes or disadvantage certain groups. This section explores the sources of such biases, which may stem from unbalanced datasets or from inherent biases in user-generated content. The impact of these biases on sentiment analysis can result in skewed classifications, particularly when analyzing sentiments related to sensitive topics. The discussion advocates for the integration of bias detection tools and fairness benchmarks into the model development and evaluation process. Techniques such as adversarial training and data augmentation strategies aimed at reducing biases are examined, along with calls for transparency in dataset curation and algorithm development. Addressing these ethical concerns is essential to build trust in the NLP systems used in sentiment analysis and to ensure equitable outcomes across different demographic groups of users.

L. PRACTICAL IMPLICATIONS FOR INDUSTRY AND RESEARCH
The practical applications of sentiment analysis using advanced NLP models extend across various industries, including marketing, customer service, politics, and public health. In marketing, companies leverage sentiment analysis to monitor brand perception and respond to customer feedback in real time, helping them refine their strategies and improve customer satisfaction. In public health and policymaking, governments and organizations can use sentiment analysis to track public opinion on initiatives, assess community concerns, and make informed decisions. This section illustrates how businesses and researchers can apply the insights from this review to select appropriate models, refine pre-processing pipelines, and adapt their approaches to specific use cases. It also discusses the implications of using models at scale, including the need for robust infrastructure and ongoing evaluation to ensure reliable outputs.

M. SUMMARY OF KEY FINDINGS
The review highlights that transformer-based models, such as BERT and RoBERTa, outperform traditional approaches due to their ability to capture deep contextual relationships. BERT's bidirectional training allows it to understand complex sentence structures, making it highly effective for nuanced sentiment tasks, while RoBERTa achieves superior performance in multi-class sentiment classification thanks to its enhanced training methods. GPT variants excel in handling nuanced context and generating human-like responses, but require extensive fine-tuning for optimal performance on Twitter data. Traditional models like LSTM and CNN are effective for sequential data processing, but fall short in capturing deep context compared to transformer architectures. Pre-processing steps, including tokenization and the handling of hashtags or emojis, directly impact model accuracy and robustness. However, challenges such as understanding sarcasm, managing domain shifts, and addressing computational resource demands highlight areas for future research and development.

V. CONCLUSION AND FUTURE WORKS
In this paper, we conducted a comprehensive review of sentiment analysis using advanced NLP models, including BERT, GPT variants, RoBERTa, and hybrid approaches. Our analysis covered various aspects of these models, such as their application to different datasets, pre-processing techniques, performance metrics, and key findings from recent studies. While transformer-based models like BERT and RoBERTa demonstrate strong capabilities in handling complex linguistic patterns and achieving high performance in sentiment classification tasks, their effectiveness is significantly influenced by the nature of the dataset, the pre-processing methods, and the domain of application.

Transformer-based architectures, particularly BERT and RoBERTa, have shown substantial promise in surpassing traditional machine learning models and simpler deep learning frameworks, thanks to their deep contextual understanding and bidirectional training. Models such as GPT-3 and its successors have also displayed significant potential in handling context-rich text and generating human-like content. However, these models often require extensive fine-tuning to achieve optimal results in sentiment analysis, underscoring the importance of tailored pre-processing techniques and domain-specific adaptation. Despite the advancements, challenges such as understanding nuanced language constructs like sarcasm, irony, and mixed sentiments, as well as handling bias and ethical concerns, remain prominent. The variability in cross-domain performance highlights the need for more adaptive and generalizable approaches.

To advance the field of sentiment analysis, future research should focus on several key areas:

Improving Model Generalization and Cross-Domain Performance: Research should explore hybrid and ensemble approaches that combine the strengths of different models to create more adaptable solutions capable of maintaining high performance across various datasets and domains. The development of transfer learning techniques and cross-lingual training frameworks can further enhance model adaptability.

Advanced Pre-Processing Techniques: Future studies should aim to develop more sophisticated pre-processing pipelines that are capable of managing informal language, slang, emojis, and context-specific data prevalent in social media and other user-generated content. Techniques that incorporate external knowledge bases and context-aware data cleaning strategies can greatly improve model outputs.

Incorporation of Multimodal Data: Integrating textual data with other modalities, such as images, audio, or video, can provide richer context and improve the ability of models to interpret sentiments accurately. Multimodal models can capture additional nuances and offer a more comprehensive understanding of user sentiment, particularly on platforms where text is accompanied by visual or auditory cues.

Ethical Considerations and Bias Mitigation: Addressing biases inherent in training data and ensuring that models operate fairly across different user demographics is crucial for building trust in NLP-based sentiment analysis tools. Future work should emphasize the integration of ethical evaluation frameworks and bias detection mechanisms during model development and training.

Real-Time and Scalable Solutions: The implementation of NLP models for large-scale, real-time sentiment analysis requires efficient and scalable solutions. Future research should explore lightweight models and optimization techniques that reduce computational overhead while maintaining performance. This is particularly relevant for industries that require rapid sentiment tracking to inform decision-making.

Explainability and Interpretability: As the complexity of NLP models increases, so does the importance of understanding how they arrive at specific outputs.
Future work should focus on enhancing model interpretability, providing insights into which features contribute most to sentiment predictions. This can help users and stakeholders trust and validate the decisions made by these systems.

The evolving landscape of NLP continues to open new possibilities for sentiment analysis, with transformer-based architectures leading the way in innovation. However, to fully harness the power of these models, it is crucial to address existing challenges, develop more adaptive and context-aware solutions, and incorporate robust ethical standards. Future research and development in these areas will help pave the way for more accurate, fair, and practical sentiment analysis applications, extending their impact across industries and research domains. The continued evolution of techniques, along with collaborative efforts between researchers and practitioners, will contribute to more comprehensive, reliable, and equitable sentiment analysis solutions.
MINARUL ISLAM received the B.S. degree from the Department of Computer Science and Engineering, Jessore University of Science and Technology, Jessore, Bangladesh, in 2016, and the M.S. degree from the Department of Electrical and Electronic Engineering, Universiti Malaysia Pahang, Pahang, Malaysia, in 2021. He is currently pursuing the full-time Ph.D. degree with the Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA. He has published more than 11 research papers in conferences and peer-reviewed journals. Recently, a poster abstract from his current project was accepted at the ACM SenSys 2024 Conference, one of the top conferences in the area of mobile computing. His primary research interests include machine learning, mobile sensing, and wireless sensor networks. During his M.S. studies, he was awarded the Bronze, Silver, Gold, and Best Innovative Technology Awards.