HARVESTING AND ANALYZING TWEETS
Presentation by Group 5
Natasha Christabelle (15753)
Reifita Ayu P (15762)
Rubila Dwi (15759)
INTRODUCTION
Twitter is a fabulous source of information. Whenever
something is happening, people around the world start
tweeting away. Many Twitter users also engage in
conversations, and looking at these conversations allows
us to identify leaders and frequent actors.
Harvesting tweets allows users to focus on a certain topic
or subject that requires further understanding. So, to
harvest tweets is basically to collect a cluster of tweets,
depending on how many tweets you want.
INTRODUCTION
Why?
Because collecting this information is essential for business or
personal purposes. Twitter draws 310 million users worldwide,
and 79% of accounts come from outside the U.S., which means
Twitter reaches audiences from many different backgrounds.
This comes in handy for those trying to get a grasp of where
public opinion falls.
How?
There are many methods and applications for harvesting and
analyzing tweets, including:
- R
- Python
- ScraperWiki
- Mozdeh
- BeautifulSoup
R STUDIO
What it is
R is a language and environment for
statistical computing and graphics. It is
a GNU project which is similar to the S
language and environment which was
developed at Bell Laboratories (formerly
AT&T, now Lucent Technologies) by John
Chambers and colleagues.
THE PURPOSE OF THIS PROJECT
-To harvest tweets into R
-To analyze a particular topic on Twitter by performing sentiment analysis
-To visualize the most frequent words and terms contained in the tweets for the topic
we search, using a word cloud and bar charts
SENTIMENT ANALYSIS
Sentiment analysis, also referred to as
Opinion Mining, implies extracting
opinions, emotions and sentiments in text.
One of the most common applications of
sentiment analysis is to track attitudes
and feelings on the web, especially for
tracking products, services, brands or
even people.
The main idea is to determine whether
they are viewed positively or negatively
by a given audience.
R SENTIMENT PACKAGES BY TIMOTHY JURKA
classify_emotion
This function helps us analyze some text and classify it into different types of
emotion: anger, disgust, fear, joy, sadness, and surprise.
classify_polarity
In contrast to the classification of emotions, the classify_polarity function allows us to
classify some text as positive or negative.
NAÏVE BAYES CLASSIFIER
A naïve Bayes classifier applies Bayes' theorem in an attempt to suggest possible
classes for any given text. To do this, it needs a number of previously classified
documents of the same type. The theorem is as follows:
P(class | text) = P(text | class) × P(class) / P(text)
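As an illustration, here is a toy sketch in base R of Bayes' theorem applied to text classification. The training data and the word "great" are invented for the example; a real classifier would combine likelihoods over all words in the text.

```r
# Toy labeled documents (invented for illustration)
train <- data.frame(
  text  = c("good great service", "bad awful service", "great food", "bad food"),
  class = c("positive", "negative", "positive", "negative"),
  stringsAsFactors = FALSE
)

# P(class | "great") = P("great" | class) * P(class) / P("great")
p_class  <- prop.table(table(train$class))                 # priors P(class)
has_word <- grepl("great", train$text)                     # docs containing the word
p_word_given_class <- tapply(has_word, train$class, mean)  # likelihoods P(word | class)
p_word   <- mean(has_word)                                 # evidence P(word)
posterior <- p_word_given_class * p_class / p_word         # Bayes' theorem
posterior
```

In this toy data, "great" appears only in positive documents, so the posterior puts all the probability mass on the positive class.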
NAÏVE BAYES CLASSIFIER MODIFICATIONS
In practice, a naïve Bayes classifier is rarely used as-is; it usually needs some
modifications to improve the performance of the algorithm itself. Some modifications that can be
done are:
1. Preprocessing, performed as the early stage of the sentiment analysis process.
Preprocessing consists of several steps, namely:
a. Converting the entire status text to lowercase.
b. Deleting URLs contained in the status text (http://www....com).
c. Removing mention tags (@) with the username.
d. Deleting hashtags (#).
e. Normalizing repeated letters, for example 'hunggrryyy'
or 'huuuungry' becomes 'hungry'.
f. Removing punctuation such as commas, single/double quotes, and
question marks contained in the status text; for example, 'beautiful!!!!!'
is replaced by 'beautiful'.
g. Keeping only words that start with a letter. For simplicity's sake, we can remove all
words that don't start with a letter, e.g. '15th', '5.34am'.
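The preprocessing steps above can be sketched as a small base-R helper. `clean_status` is our own illustrative name, not part of any package, and the repeated-letter rule is a simple heuristic (collapsing runs of three or more of the same letter):

```r
# Minimal sketch of preprocessing rules a-g (assumed helper, base R only)
clean_status <- function(txt) {
  txt <- tolower(txt)                        # a. lowercase
  txt <- gsub("http\\S+", "", txt)           # b. remove URLs
  txt <- gsub("@\\w+", "", txt)              # c. remove @username mentions
  txt <- gsub("#\\w+", "", txt)              # d. remove hashtags
  txt <- gsub("([a-z])\\1{2,}", "\\1", txt)  # e. collapse 3+ repeated letters
  txt <- gsub("[[:punct:]]", "", txt)        # f. remove punctuation
  # g. keep only tokens that start with a letter
  tokens <- unlist(strsplit(txt, "\\s+"))
  tokens <- tokens[grepl("^[a-z]", tokens)]
  paste(tokens, collapse = " ")
}

clean_status("Soooo hungry!!! see http://t.co/abc @user #lunch at 5.34am")
```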
NAÏVE BAYES CLASSIFIER MODIFICATIONS
2. Stopword removal is the removal of words that carry no sentiment and can be
safely discarded. Examples of English stopwords
are 'is', 'a', 'all'. For Indonesian, examples include the names of the months, pronouns,
and conjunctions.
3. N-grams is a technique used to handle negation words (like 'not', 'isn't').
In addition, N-grams are also used to handle phrases that appear in a
status text. In sentiment analysis, the N-gram most commonly used is the bigram (two-word
combination).
For example, the text "Service is bad" is tokenized with unigrams
and bigrams into: Service, is, bad, Service is, is bad.
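A minimal sketch of unigram-plus-bigram tokenization in base R (`tokenize_ngrams` is an illustrative helper, not a library function):

```r
# Tokenize a status text into unigrams followed by bigrams
tokenize_ngrams <- function(txt) {
  words <- unlist(strsplit(tolower(txt), "\\s+"))
  bigrams <- if (length(words) > 1) {
    paste(words[-length(words)], words[-1])  # adjacent word pairs
  } else {
    character(0)
  }
  c(words, bigrams)
}

tokenize_ngrams("Service is bad")
# unigrams "service", "is", "bad" plus bigrams "service is", "is bad"
```

Keeping negations attached to the following word ("is bad", "not good") is what lets the classifier learn that a negated positive word signals negative sentiment.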
STEP 1: LOAD PACKAGES
# required packages
library(twitteR)
library(sentiment)
library(plyr)
library(ggplot2)
library(wordcloud)
library(RColorBrewer)
library(tm)  # for removeWords, stopwords, Corpus in Step 8
STEP 2: AUTHORIZATION
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "http://api.twitter.com/oauth/authorize"
# replace with your own app credentials from apps.twitter.com
apiKey <- "YOUR_API_KEY"
apiSecret <- "YOUR_API_SECRET"
access_token <- "YOUR_ACCESS_TOKEN"
access_token_secret <- "YOUR_ACCESS_TOKEN_SECRET"
setup_twitter_oauth(apiKey, apiSecret, access_token, access_token_secret)
STEP 3: COLLECT SOME TWEETS CONTAINING THE
TERM "DONALD TRUMP"
# harvest some tweets
some_tweets = searchTwitter("donald trump", n=1000, lang="en")
# get the text
some_txt = sapply(some_tweets, function(x) x$getText())
STEP 4: PREPARE THE TEXT FOR SENTIMENT
ANALYSIS
# remove retweet entities
some_txt = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", "", some_txt)
# remove at people
some_txt = gsub("@\\w+", "", some_txt)
# remove punctuation
some_txt = gsub("[[:punct:]]", "", some_txt)
# remove numbers
some_txt = gsub("[[:digit:]]", "", some_txt)
# remove html links
some_txt = gsub("http\\w+", "", some_txt)
# remove unnecessary spaces
some_txt = gsub("[ \t]{2,}", "", some_txt)
some_txt = gsub("^\\s+|\\s+$", "", some_txt)

# define "tolower error handling" function
try.error = function(x)
{
  # create missing value
  y = NA
  # tryCatch error
  try_error = tryCatch(tolower(x), error=function(e) e)
  # if not an error
  if (!inherits(try_error, "error"))
    y = tolower(x)
  # result
  return(y)
}
# lower case using try.error with sapply
some_txt = sapply(some_txt, try.error)

# remove NAs in some_txt
some_txt = some_txt[!is.na(some_txt)]
names(some_txt) = NULL
STEP 5: PERFORM SENTIMENT ANALYSIS
# classify emotion
class_emo = classify_emotion(some_txt, algorithm="bayes", prior=1.0)
# get emotion best fit
emotion = class_emo[,7]
# substitute NA's by "unknown"
emotion[is.na(emotion)] = "unknown"
# classify polarity
class_pol = classify_polarity(some_txt, algorithm="bayes")
# get polarity best fit
polarity = class_pol[,4]
STEP 6: CREATE DATA FRAME WITH THE RESULTS
AND OBTAIN SOME GENERAL STATISTICS
# data frame with results
sent_df = data.frame(text=some_txt, emotion=emotion,
polarity=polarity, stringsAsFactors=FALSE)
# sort data frame
sent_df = within(sent_df,
emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))
STEP 7: CREATE PLOT DISTRIBUTION OF
EMOTION
# plot distribution of emotions
ggplot(sent_df, aes(x=emotion)) +
geom_bar(aes(y=..count.., fill=emotion)) +
scale_fill_brewer(palette="Dark2") +
labs(x="emotion categories", y="number of tweets") +
ggtitle("Sentiment Analysis of Tweets about Donald Trump\n(classification by emotion)")
STEP 7: CREATE PLOT DISTRIBUTION OF
POLARITY
# plot distribution of polarity
ggplot(sent_df, aes(x=polarity)) +
geom_bar(aes(y=..count.., fill=polarity)) +
scale_fill_brewer(palette="RdGy") +
labs(x="polarity categories", y="number of tweets") +
ggtitle("Sentiment Analysis of Tweets about Donald Trump\n(classification by polarity)")
STEP 8: CREATE COMPARISON CLOUD BASED ON
EMOTION
# separating text by emotion
emos = levels(factor(sent_df$emotion))
nemo = length(emos)
emo.docs = rep("", nemo)
for (i in 1:nemo)
{
  tmp = some_txt[emotion == emos[i]]
  emo.docs[i] = paste(tmp, collapse=" ")
}

# remove stopwords
emo.docs = removeWords(emo.docs, stopwords("english"))

# create corpus
corpus = Corpus(VectorSource(emo.docs))
tdm = TermDocumentMatrix(corpus)
tdm = as.matrix(tdm)
colnames(tdm) = emos

# comparison word cloud
comparison.cloud(tdm, colors = brewer.pal(nemo, "Dark2"),
                 scale = c(3,0.5), random.order = FALSE,
                 title.size = 1.5)
THANK YOU FOR YOUR ATTENTION