
Contribution Title?

Sergiu Limboi and Laura Dioşan

Babeş-Bolyai University, Faculty of Mathematics and Computer Science,


Cluj-Napoca, Romania

Abstract. The Twitter platform is one of the most popular social media environments, gathering concise messages in which users express their views on the topics of the moment. The valuable information extracted from tweets can be applied in many areas and activities, and the process of Twitter analysis can be regarded as a task within the Text Mining domain. Processing sentiments from tweets is challenging due to the complexity of natural language, misspellings and shortened word forms. Sentiment Analysis is a field that classifies emotional information into polarity classes (positive, negative and neutral). The resulting classification can support strategic and managerial decision making.
The goal of this article is to present a variety of Sentiment Analysis approaches focused on information from social media, especially Twitter. A baseline perspective is presented through different scenarios that take into consideration preprocessing techniques, data representations, methods and evaluation measures. In addition, two interesting approaches are described in detail: the hashtag-based one and the synset-based one. All these methods are highlighted in order to show the importance and impact that the analysis of tweets has on social studies and on society in general.

Keywords: Sentiment Analysis, Twitter, Hashtags

1 Introduction

Nowadays there is an increased interest from areas like politics, business, economy or marketing in finding answers to questions regarding people's opinions and feelings, such as "What do people hate about the iPhone 6?" or "Is this movie worth watching?". This interest leads to the analysis of social media content, which is very useful in activities like opinion mining from product reviews or sentiment polarity classification.
Twitter is considered "a valuable online source for opinions" [9] and offers a way to capture the public's ideas and interests for social studies.
Bearing in mind all these questions, a methodology or research area is needed that can address these issues and help people analyze a context in order to understand it. Therefore, an interesting domain is defined
? Supported by organization x.
for all these problems, and it is called Sentiment Analysis. Sentiment Analysis [1] is the field of Natural Language Processing that identifies and extracts opinions from written or spoken language. Related tasks include information extraction (filtering subjective information), question answering (identifying opinion-oriented inquiries) and summarization (producing a condensed representation of the original text).
The remainder of the paper is structured as follows: Section 2 describes the Twitter environment, Section 3 introduces the main Sentiment Analysis concepts and phases, Section 4 presents the baseline approach and its experimental results, and Sections 5-7 detail the hashtag-based experiments on hashtag-only text, on text concatenated with hashtags, and on the initial text.

2 Twitter Environment

Twitter is a popular social medium for communicating with other people, expressing feelings and opinions and broadcasting news. The advantages of such a powerful tool are the availability on different electronic devices, the opportunity to have a large friend pool and the possibility to send small, concise messages (called tweets) to other friends on a variety of subjects [8]. It is a challenge to gather all relevant data and to detect and summarize news on a specific topic. For a user, finding other users with interesting tweets is a problem, since it requires reading through status updates and following the links attached to tweets in order to obtain more information. The analysis of Twitter is a research area of high and growing interest, since some research problems are still poorly defined and new difficulties are described day by day. In recent years, researchers have focused on issues like event detection, topic mining or sentiment analysis.
The main concepts used in the Twitter environment are: URL, mention, user, friend, follower, tweet, hash-tag and re-tweet [7]. A URL is a link that points to information about a specific topic from the message posted on Twitter. A user is a person or a system that can post messages on Twitter [8]. This social medium defines a friend-follower relationship. For example, let us consider two users a and b. User a has the option to receive all the tweets written by user b; then b becomes a friend of a, and a is a follower of b. The reverse relation is not mandatory, because user b is not forced to receive messages from user a. A user is also defined by several properties: name, source location, list of friends and followers, number of tweets, photo and a short description. In the Twitter background there are two main abbreviations: RT and DM. RT means re-tweet, i.e. re-posting a message written by a specific user, and DM signifies Direct Message, used when you want to send a private message to a user.
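The friend-follower relationship described above can be sketched with a simple data structure (a hypothetical minimal model for illustration, not part of any Twitter API):

```python
# Hypothetical sketch: follows[x] holds the users whose tweets x receives,
# i.e. x's friends; followers are derived from the same relation.
follows = {"a": {"b"}, "b": set()}  # user a receives the tweets of user b


def is_friend_of(candidate, user):
    # 'candidate' is a friend of 'user' if 'user' receives candidate's tweets
    return candidate in follows.get(user, set())


def followers_of(user):
    # everyone who receives 'user''s tweets is a follower of 'user'
    return {u for u, friends in follows.items() if user in friends}
```

Here is_friend_of("b", "a") holds and followers_of("b") yields {"a"}, while the reverse relation does not hold, mirroring the non-mandatory reciprocity described above.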
A tweet is composed of two main parts: a short and simple text message (maximum 140 characters) posted on the social medium, and hash-tags together with hypertext-based elements: related media (maps, photos, videos) and websites. Hash-tags [8] are keywords prefixed with the "#" symbol that can appear in a tweet. Twitter users use this notation to categorize their messages and make them easier to find in searches.
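As an illustration, hash-tags can be pulled out of a tweet with a regular expression (a minimal sketch; the pattern is a simplification of Twitter's actual hashtag rules):

```python
import re


def extract_hashtags(tweet):
    # hash-tags are keywords prefixed with the '#' symbol
    return re.findall(r"#(\w+)", tweet)


tags = extract_hashtags("Loving the new #iPhone camera #Apple")  # ['iPhone', 'Apple']
```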
3 Sentiment Analysis
The features of Sentiment Analysis include the following:
– scalability: despite the big amount of textual information, Sentiment Analysis can handle it and process data at scale in an "efficient and cost-effective way" [1]
– real-time analysis: it helps to define new strategies and analyses for current problems (e.g. Is there an angry client? Is a famous presidential candidate going to lose the competition?)
– detection of changes in people's opinions: by applying a Sentiment Analysis process over different periods of time, relevant changes regarding a product, service or another topic can be detected
– tracking of clients' satisfaction
– support for improving business marketing
Around the Sentiment Analysis area, multiple concepts can be defined, such as opinion, polarity, subject or opinion holder. An opinion is an expression that reflects people's feelings about a specific topic. This expression can take various forms: written or text-based, spoken or voice-based, and behavioral or gesture-based. Sentiment Analysis can be modeled as a classification problem that involves subjectivity (classifying opinions into subjective and objective ones) and polarity (classifying expressions into negative, positive and neutral), based on input represented as textual information (documents, messages, etc.). In this paper we focus on the polarity classification task. The subject is the object that people talk about; it can be a product, a service, an event or a famous politician. An opinion holder is the person that expresses the sentiment about a topic.
The complex process of classifying opinion polarity can be applied at different levels [6]:
– document level: the whole document or text is treated as a single piece of information; this approach fits when the document refers to only one subject/topic
– sentence level: a sentence is the unit of information and the polarity is determined for each sentence
– word level: the most fine-grained analysis.

3.1 Phases
Sentiment Analysis, viewed as a polarity classification task (detecting opinions and classifying them into positive, negative or neutral), involves the phases suggested in Figure 1.
The initialization step prepares the data for the classification algorithm. Data collection means retrieving the data and analyzing its content (How many messages are positive, negative or neutral? Is the data balanced?). If the data is not already labeled, manual annotation is required in order to validate the approach. The preprocessing phase transforms the unstructured information into a clean form without misspellings, abbreviations or slang words. Then,
Fig. 1. Sentiment Analysis Phases

an attribute selection stage determines how the data is represented in terms of relevant features. The output of the initialization step becomes the input of the learning step. A learning phase is mandatory because the system needs a supervisor or trainer that can tell it the expected output for a given input. Preprocessing text and analyzing the sentiments in it is a very hard task and cannot be done without an automatic component (a Machine Learning component). Here, the training data is passed to a Machine Learning algorithm that will classify messages into different classes. The last phase is the evaluation, where the classifier is applied on a test dataset and performance measures are computed in order to reflect how good the Sentiment Analysis methodology is.

Preprocessing techniques The preprocessing step is very important for the polarity classification task, as it provides clean and relevant information for the next phases.
Cleaning operations are those that only normalize the words of the textual information and remove the drawbacks introduced by the free style in which opinions are written. The following methods can therefore be applied to obtain a uniform text [3]:

– removal of URLs and hashtags from messages
– removal of numbers and special characters
– replacement of repeated character sequences with a single one
– removal of extra blank spaces
– lowercasing: all letters are converted to lower case.
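The cleaning operations listed above could be sketched as follows (an illustrative implementation with assumed regular expressions, not the exact procedure of [3]):

```python
import re


def clean(tweet):
    t = tweet.lower()                       # lowercasing
    t = re.sub(r"https?://\S+", "", t)      # removal of URLs
    t = re.sub(r"#\w+", "", t)              # removal of hashtags
    t = re.sub(r"\d+", "", t)               # removal of numbers
    t = re.sub(r"(.)\1{2,}", r"\1", t)      # collapse repeated sequences (soooo -> so)
    t = re.sub(r"[^a-z\s]", "", t)          # removal of special characters
    return re.sub(r"\s+", " ", t).strip()   # removal of extra blank spaces


clean("I LOVE it!!! soooo much http://t.co/x #happy 123")  # 'i love it so much'
```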
Negation handling is an essential operation because negative words strongly influence the polarity of a message; ignoring negation is one of the causes of misclassification. Thus, all negative words (can't, don't, never) are replaced with not. The dictionary approach converts slang words and abbreviations to their formal forms: a word like "l8" is converted to "late" [3] and "approx." to "approximately". Its use reduces the noise in the dataset. Removing stop words is very important and can improve the performance of the classifier. Stop words can be pronouns or articles (e.g. our, me, myself, that, because, etc.).
Stemming reduces an inflected or derived form to a common radix (e.g. cars becomes car); basically, stemming cuts off the prefixes or suffixes of a word. A flavor of stemming is lemmatization, which works with the morphological context based on dictionaries (e.g. studies is converted into study).
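A minimal sketch of the negation, dictionary and stop-word steps might look as follows (the word lists are small illustrative samples, not complete resources):

```python
NEGATIONS = {"can't", "don't", "never", "won't", "isn't"}      # sample negative words
SLANG = {"l8": "late", "approx.": "approximately"}             # dictionary approach
STOP_WORDS = {"our", "me", "myself", "that", "because"}        # sample stop words


def normalize(tokens):
    out = []
    for tok in tokens:
        tok = SLANG.get(tok, tok)   # expand slang words and abbreviations
        if tok in NEGATIONS:
            tok = "not"             # replace negative words with 'not'
        if tok in STOP_WORDS:
            continue                # drop stop words
        out.append(tok)
    return out


normalize(["don't", "be", "l8", "because", "of", "me"])  # ['not', 'be', 'late', 'of']
```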

Attributes determination In this phase, the relevant features of the text are extracted and used as inputs for the classification algorithms. Text mining deals with several kinds of attributes, but only some of them are frequently used in the context of tweets (e.g. hash-tags). The document-term matrix representation reflects the frequency of the words extracted from a collection of documents (texts, sentences). In this paper, two granularities are designed for the document-term matrix representation: word and bi-gram granularity. In other words, each row of the matrix corresponds to a document and the columns represent the granularity level (word/bi-gram). Bi-gram granularity means a sequence of two consecutive words.
Two scenarios are also defined for determining the values of the matrix. The value of each cell can be integer or real. Integer values describe the frequency of the granularity level in the document. The real values are computed considering the TF (term frequency) and IDF (inverse document frequency) formulas:
frequency) formula:
m
T F (t) =
M
and
N
IDF (t) = log( ),
n
where m is the number of times term t appears in the document, M is the
number of terms in the document, N is the number of documents and n is the
number of documents where term t appears [2]
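Using the TF and IDF definitions above, a tf-idf score could be computed as follows (a direct transcription of the formulas; the sample documents are illustrative):

```python
import math


def tf(term, doc):
    # TF(t) = m / M: occurrences of the term over the total terms in the document
    return doc.count(term) / len(doc)


def idf(term, docs):
    # IDF(t) = log(N / n): total documents over documents containing the term
    n = sum(1 for d in docs if term in d)
    return math.log(len(docs) / n)


docs = [["good", "phone", "good"], ["bad", "battery"], ["good", "battery", "life"]]
score = tf("good", docs[0]) * idf("good", docs)  # (2/3) * log(3/2)
```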

Machine learning algorithms and performance measures For the classification task, various techniques can be applied: lexicon-based, machine learning or hybrid approaches. In this paper the focus is on machine learning algorithms, so we briefly describe several classification methods [5]:
– Naı̈ve Bayes is a probabilistic algorithm based on Bayes' theorem. For each word, the probability of belonging to a class is computed.
– Support Vector Machine (SVM) is a deterministic algorithm that finds a hyperplane separating the input data into two classes, one on each side of it.
– Logistic regression is a statistical classifier based on a logistic function.
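The per-word probability idea behind Naı̈ve Bayes can be sketched as follows (a minimal multinomial variant with Laplace smoothing on toy data; real experiments would use a library implementation):

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    counts = defaultdict(Counter)   # per-class word frequencies
    priors = Counter(labels)        # class frequencies
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    return counts, priors


def predict_nb(doc, counts, priors):
    vocab = {w for c in counts.values() for w in c}
    best_label, best_score = None, -math.inf
    for label in priors:
        # log prior + per-word log likelihoods (Laplace smoothing)
        score = math.log(priors[label] / sum(priors.values()))
        total = sum(counts[label].values())
        for w in doc:
            score += math.log((counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb([["good", "great"], ["bad", "awful"], ["good", "nice"]],
                 ["pos", "neg", "pos"])
predict_nb(["good"], *model)  # 'pos'
```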

In terms of performance measures, accuracy, precision, recall and F-score can be computed in order to indicate which algorithm fits best.
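The two measures reported in the experiments below could be computed as follows (a small self-contained sketch with illustrative labels):

```python
def accuracy(y_true, y_pred):
    # fraction of all predictions that match the true label
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive="pos"):
    # fraction of predicted positives that are truly positive
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    predicted_pos = sum(p == positive for p in y_pred)
    return tp / predicted_pos if predicted_pos else 0.0


y_true = ["pos", "neg", "pos", "pos"]
y_pred = ["pos", "pos", "neg", "pos"]
accuracy(y_true, y_pred)   # 0.5
precision(y_true, y_pred)  # 2/3
```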

4 Baseline Sentiment Analysis (BSA)

Several experiments were conducted, focusing on the preprocessing techniques, the data representation in terms of relevant extracted features and the Machine Learning algorithms involved in the classification process.

4.1 Dataset

For the following methodologies, the Sanders dataset [4] is used. It consists of 5113 annotated tweets: 519 marked as positive, 572 negative, 2333 neutral and 1689 irrelevant. These tweets are messages related to four main topics, important companies worldwide: Twitter, Apple, Google and Microsoft. The focus is on classifying messages into positive and negative ones; therefore, irrelevant and neutral messages are removed and 1091 tweets are considered for the process. 20% of them were used as the testing dataset and the rest for training.
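The filtering and split sizes reported above amount to the following (a simple arithmetic check; the actual split indices are not specified in the text):

```python
# counts kept from the Sanders dataset after removing neutral/irrelevant tweets
positive, negative = 519, 572
binary_total = positive + negative         # tweets considered for the process
test_size = round(0.2 * binary_total)      # 20% held out for testing
train_size = binary_total - test_size      # remainder used for training
```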

4.2 Data preprocessing

Data preprocessing is an essential step in the Sentiment Analysis domain because it can improve the whole process by removing the problems caused by the free writing style of microblogs (e.g. misspellings, use of slang words, abbreviations). Since a fast and simple procedure is desired, the following techniques were applied on the Sanders dataset:
– cleaning operations: removal of punctuation, lowercasing
– removal of stop words
– stemming

4.3 Data representation

After the preprocessing phase, the focus is on representing the tweets for the classification algorithms. Consequently, the document-term matrix is built considering the granularity levels (word and bi-gram) and the way of computing the values: the frequency of words/bi-grams or the tf-idf computation explained in the previous section.
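The integer-valued variant of this representation could be built as follows (a minimal sketch with illustrative documents; the real matrices are built from the preprocessed tweets):

```python
from collections import Counter


def ngrams(tokens, n):
    # granularity level: n=1 gives words, n=2 gives bi-grams
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_term_matrix(docs, n=1):
    vocab = sorted({g for d in docs for g in ngrams(d, n)})
    # one row per document, one column per word/bi-gram, integer frequencies
    rows = [[Counter(ngrams(d, n))[g] for g in vocab] for d in docs]
    return vocab, rows


docs = [["good", "phone", "good"], ["bad", "phone"]]
vocab, rows = document_term_matrix(docs, n=1)
```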
4.4 Classification algorithms and evaluation measures

As classification algorithms, Naı̈ve Bayes (NB), Support Vector Machine (SVM) and Logistic Regression (LR) are used for the BSA. For the evaluation phase, accuracy and precision are computed. Regarding the parameters of the classification task, for logistic regression the inverse of the regularization strength is set to 1.5. The SVM classifier is applied with a linear kernel and the regularization parameter set to 1.0. Last but not least, the Multinomial version of Naı̈ve Bayes is used for the classification task.
All experiments use 786 messages (the messages without hashtags were removed): 415 positive and 371 negative.

4.5 Text without hashtags

Table 1. Accuracy for BSA with word granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 59.49% 57.59% 60.13%
Removal of punctuation 77.85% 73.42% 70.89%
Removal of stop words 58.23% 58.86% 55.7%
Lowercasing 65.19% 58.23% 57.59%
Stemming 59.49% 57.59% 60.13%
All 55.06% 56.96% 56.96%

Table 2. Precision for BSA with word granularity and frequency values

Preprocessing technique NB LR SVM
Without preprocessing 63.15% 60.75% 61.79%
Removal of punctuation 81.81% 76.92% 69.38%
Removal of stop words 62.85% 62.66% 57.95%
Lowercasing 68.83% 61.25% 59.55%
Stemming 63.15% 60.75% 61.79%
All 56.70% 58.16% 58.00%

Word granularity and bigram granularity (frequency and tf-idf values) Due to the short length of the messages (maximum 140 characters), applying all the preprocessing techniques turns out to be a bad choice. The best values are achieved when
Table 3. Accuracy for BSA with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 61.39% 60.13% 65.19%
Removal of punctuation 74.68% 73.42% 73.42%
Removal of stop words 61.39% 61.39% 65.19%
Lowercasing 59.49% 57.59% 63.29%
Stemming 61.39% 60.13% 65.19%
All 51.27% 53.80% 53.80%

Table 4. Precision for BSA with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 60.74% 62.65% 69.86%
Removal of punctuation 74.44% 75.60% 72.34%
Removal of stop words 61.11% 65.33% 68.83%
Lowercasing 57.93% 59.77% 65.85%
Stemming 60.74% 62.65% 69.86%
All 52.63% 55.91% 56.04%

Table 5. Accuracy for BSA with bi-gram granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 59.49% 58.86% 59.49%
Removal of punctuation 76.58% 75.32% 75.95%
Removal of stop words 60.13% 60.13% 63.29%
Lowercasing 61.39% 63.29% 62.66%
Stemming 58.86% 58.23% 61.39%
All 47.57% 56.33% 57.59%

Table 6. Precision for BSA with bi-gram granularity and frequency values

Preprocessing technique NB LR SVM
Without preprocessing 63.51% 63.01% 63.88%
Removal of punctuation 80.51% 78.48% 75.55%
Removal of stop words 64.78% 65.67% 69.11%
Lowercasing 65.33% 67.56% 65.43%
Stemming 62.66% 62.50% 65.33%
All 49.09% 58.82% 60.49%
Table 7. Accuracy for BSA with bi-gram granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 59.49% 62.66% 60.76%
Removal of punctuation 74.05% 73.42% 72.78%
Removal of stop words 61.39% 62.03% 60.13%
Lowercasing 56.96% 62.66% 61.39%
Stemming 59.49% 61.39% 60.13%
All 45.57% 55.06% 55.06%

Table 8. Precision for BSA with bi-gram granularity and tf-idf values

Preprocessing technique NB LR SVM
Without preprocessing 61.36% 66.66% 64.86%
Removal of punctuation 74.15% 75.60% 72.52%
Removal of stop words 62.36% 66.21% 66.15%
Lowercasing 58.33% 67.12% 64.93%
Stemming 61.36% 65.33% 64.78%
All 49.21% 57.30% 57.64%

only the removal of punctuation is applied. Also, for pure textual information the quality of the Machine Learning techniques decreases due to the removal of hashtags, which carry valuable information.

5 Hashtag-based Sentiment Analysis (HSA)

5.1 Pure hashtags: hashtag-only text

Word granularity - integer values (frequencies)

Table 9. Accuracy for HSA with word granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 66.46% 65.19% 65.82%
With preprocessing (removal of punctuation) 65.19% 69.62% 66.46%

Bigram granularity
Table 10. Precision for HSA with word granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 73.84% 68.83% 68.29%
With preprocessing (removal of punctuation) 71.01% 73.07% 65.80%

Table 11. Accuracy for HSA with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 64.65% 63.29% 66.46%
With preprocessing (removal of punctuation) 64.56% 68.35% 67.09%

Table 12. Precision for HSA with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 69.44% 65.11% 68.67%
With preprocessing (removal of punctuation) 67.94% 70.73% 71.05%

Table 13. Accuracy for HSA with bigram granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 64.56% 67.09% 65.82%
With preprocessing (removal of punctuation) 63.29% 63.29% 65.19%

Table 14. Precision for HSA with bigram granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 68.91% 70.51% 69.73%
With preprocessing (removal of punctuation) 67.10% 65.11% 69.86%

Table 15. Accuracy for HSA with bigram granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 66.46% 66.46% 65.19%
With preprocessing (removal of punctuation) 65.19% 63.92% 62.03%
Table 16. Precision for HSA with bigram granularity and tf-idf values

Preprocessing technique NB LR SVM
Without preprocessing 71.83% 70.12% 68.83%
With preprocessing (removal of punctuation) 67.46% 68.00% 66.21%

Table 17. Accuracy for impure text with word granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 63.29% 67.09% 65.19%
With preprocessing (removal of punctuation) 79.75% 75.32% 71.52%

Table 18. Precision for impure text with word granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 68.57% 68.18% 68.35%
With preprocessing (removal of punctuation) 84.21% 78.48% 75.32%

Table 19. Accuracy for impure text with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 63.29% 65.82% 68.35%
With preprocessing (removal of punctuation) 80.38% 76.58% 77.22%

Table 20. Precision for impure text with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 63.2% 68.18% 68.35%
With preprocessing (removal of punctuation) 79.121% 77.64% 77.50%

Table 21. Accuracy for impure text with bigram granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 65.19% 65.82% 65.19%
With preprocessing (removal of punctuation) 79.75% 77.22% 70.89%
Table 22. Precision for impure text with bigram granularity and frequency values

Preprocessing technique NB LR SVM


Without preprocessing 68.35% 70.83% 69.33%
With preprocessing (removal of punctuation) 83.33% 81.57% 76.38%

Table 23. Accuracy for impure text with bigram granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 63.29% 65.82% 68.35%
With preprocessing (removal of punctuation) 80.83% 76.58% 77.22%

6 Impure text: pure text concatenated with hashtags

7 Pure text + hashtags: the initial text

References
1. Sentiment analysis overview. [Link]
2. Tf idf feature. [Link]
3. Angiani, G., Ferrari, L., Fontanini, T., Fornacciari, P., Iotti, E., Magliani, F., Mani-
cardi, S.: A comparison between preprocessing techniques for sentiment analysis in
twitter. In: KDWeb (2016)
4. Deshmukh, R., Pawar, K.: Twitter sentiment classification on sanders data using
hybrid approach 17, 118–123 (07 2015)
5. Michalski, R.S., Carbonell, J.G., Mitchell, T.M.: Machine learning: An artificial intelligence approach. Springer (2013)
6. Patil, P., Yalagi, P.: Sentiment analysis levels and techniques: A survey. space 1, 6
(2013)
7. Pawar, K.K., Shrishrimal, P.P., Deshmukh, R.: Twitter sentiment analysis: A review.
International Journal of Scientific & Engineering Research 6(4), 9 (2015)
8. Sankaranarayanan, J., Samet, H., Teitler, B.E., Lieberman, M.D., Sperling, J.: Twit-
terstand: News in tweets. In: Proceedings of the 17th ACM SIGSPATIAL Interna-
tional Conference on Advances in Geographic Information Systems. pp. 42–51. GIS
’09, ACM, New York, NY, USA (2009). [Link]
[Link]

Table 24. Precision for impure text with bigram granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 61.81% 68.75% 70.73%
With preprocessing (removal of punctuation) 79.12% 77.64% 81.57%
Table 25. Accuracy for pure text with hashtags with word granularity and frequency
values

Preprocessing technique NB LR SVM


Without preprocessing 65.19% 67.09% 63.92%
With preprocessing (removal of punctuation) 79.75% 79.75% 77.85%

Table 26. Precision for pure text with hashtags with word granularity and frequency
values

Preprocessing technique NB LR SVM


Without preprocessing 71.01% 67.09% 63.92%
With preprocessing (removal of punctuation) 84.21% 85.13% 84.50%

Table 27. Accuracy for pure text with hashtags with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 64.56% 65.82% 69.62%
With preprocessing (removal of punctuation) 79.75% 79.75% 77.85%

Table 28. Precision for pure text with hashtags with word granularity and tf-idf values

Preprocessing technique NB LR SVM


Without preprocessing 62.50% 68.75% 71.42%
With preprocessing (removal of punctuation) 79.54% 84.21% 85.50%

Table 29. Accuracy for pure text with hashtags with bigram granularity and frequency
values

Preprocessing technique NB LR SVM


Without preprocessing 67.09% 65.82% 63.92%
With preprocessing (removal of punctuation) 78.48% 74.68% 77.85%

Table 30. Precision for pure text with hashtags with bigram granularity and frequency
values

Preprocessing technique NB LR SVM


Without preprocessing 70.51% 70.83% 68.00%
With preprocessing (removal of punctuation) 82.89% 80.55% 83.56%

Table 31. Accuracy for pure text with hashtags with bigram granularity and tf-idf
values

Preprocessing technique NB LR SVM


Without preprocessing 64.56% 66.46% 63.29%
With preprocessing (removal of punctuation) 77.22% 77.22% 77.85%
Table 32. Precision for pure text with hashtags with bigram granularity and tf-idf
values

Preprocessing technique NB LR SVM


Without preprocessing 64.58% 70.12% 67.56%
With preprocessing (removal of punctuation) 78.26% 81.57% 82.43%

9. Zhang, L., Ghosh, R., Dekhil, M., Hsu, M., Liu, B.: Combining lexicon-based and
learning-based methods for twitter sentiment analysis (2011)
