Lab 15 Assignment by Ankit

The document outlines a lab assignment for sentiment analysis of tweets using Python. It includes steps for data collection, preprocessing, sentiment analysis with VADER, word frequency analysis, visualization, and building a web app using Streamlit. Additionally, it provides options for deployment on various platforms like Streamlit Community Cloud and Heroku.

Lab 15 Assignment

By: Ankit Singh

1. Data Collection

• Option 1: Use the Twitter API (via Tweepy)

• Option 2: Use a provided dataset (CSV format)

python
import pandas as pd

# If using a dataset (Option 2); the file must contain a 'text' column
df = pd.read_csv('tweets_game.csv')  # example dataset
2. Preprocess Tweets

• Remove URLs, mentions, hashtags, and punctuation; convert to lowercase; remove stopwords.

python
import re
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')
stop_words = set(stopwords.words('english'))  # build once, not per tweet

def clean_tweet(tweet):
    tweet = re.sub(r"http\S+|www\S+|https\S+", '', tweet,
                   flags=re.MULTILINE)              # drop URLs
    tweet = re.sub(r'@\w+|#', '', tweet)            # drop mentions and '#' symbols
    tweet = re.sub(r'[^A-Za-z\s]', '', tweet.lower())  # keep letters and spaces only
    tokens = [word for word in tweet.split() if word not in stop_words]
    return ' '.join(tokens)

df['cleaned'] = df['text'].apply(clean_tweet)
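The substitutions inside clean_tweet can be checked in isolation on a made-up tweet, with no NLTK download needed for this part:

```python
import re

tweet = "Loving the new update!!! http://t.co/abc @GameDev #gaming 10/10"
tweet = re.sub(r"http\S+|www\S+|https\S+", '', tweet)  # drop URLs
tweet = re.sub(r'@\w+|#', '', tweet)                   # drop mentions and '#'
tweet = re.sub(r'[^A-Za-z\s]', '', tweet.lower())      # keep letters/spaces only
print(' '.join(tweet.split()))  # → loving the new update gaming
```

Stopword removal would then drop "the" in the final step.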
3. Sentiment Analysis

Use VADER from NLTK.

python
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')

sia = SentimentIntensityAnalyzer()
df['sentiment_score'] = df['cleaned'].apply(
    lambda x: sia.polarity_scores(x)['compound'])

def classify_sentiment(score):
    if score >= 0.05:
        return 'Positive'
    elif score <= -0.05:
        return 'Negative'
    else:
        return 'Neutral'

df['sentiment'] = df['sentiment_score'].apply(classify_sentiment)
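The ±0.05 cutoffs follow VADER's conventional thresholds for the compound score; the boundary behaviour can be sanity-checked directly (sample scores chosen arbitrarily):

```python
def classify_sentiment(score):
    if score >= 0.05:
        return 'Positive'
    elif score <= -0.05:
        return 'Negative'
    else:
        return 'Neutral'

# Boundary checks: exactly +/-0.05 is not Neutral
print(classify_sentiment(0.6))    # Positive
print(classify_sentiment(-0.05))  # Negative
print(classify_sentiment(0.0))    # Neutral
```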

4. Word Frequency Analysis

Find most common words in positive and negative tweets.

python
from collections import Counter

positive_words = ' '.join(df[df['sentiment'] == 'Positive']['cleaned']).split()
negative_words = ' '.join(df[df['sentiment'] == 'Negative']['cleaned']).split()
positive_freq = Counter(positive_words).most_common(10)
negative_freq = Counter(negative_words).most_common(10)
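On a toy word list (hypothetical words, not from the dataset), most_common works like this:

```python
from collections import Counter

words = "lag lag lag crash crash fun".split()
print(Counter(words).most_common(2))  # → [('lag', 3), ('crash', 2)]
```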

5. Visualization

Using Matplotlib, Seaborn, or WordCloud.

python
import seaborn as sns
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Sentiment distribution
sns.countplot(x='sentiment', data=df)
plt.title("Sentiment Distribution")
plt.show()

# Word clouds (to_image() returns a PIL image; call .show() or save it to view)
WordCloud(width=800, height=400).generate(' '.join(positive_words)).to_image()
WordCloud(width=800, height=400).generate(' '.join(negative_words)).to_image()

6. Build Web App using Streamlit

python
import streamlit as st

st.title("Game Tweet Sentiment Analysis")

st.write("### Sentiment Distribution")
st.bar_chart(df['sentiment'].value_counts())

st.write("### Most Common Positive Words")
st.write(pd.DataFrame(positive_freq, columns=['Word', 'Count']))

st.write("### Most Common Negative Words")
st.write(pd.DataFrame(negative_freq, columns=['Word', 'Count']))

To run the app:

bash
streamlit run app.py

7. Deployment

You can deploy on:

• Streamlit Community Cloud (free)

• Render or Heroku (general-purpose platforms, also commonly used for Flask apps)
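For Streamlit Community Cloud, the repository typically needs a requirements.txt alongside app.py listing the packages imported above (these are the standard PyPI names; pin versions as needed):

```text
pandas
nltk
matplotlib
seaborn
wordcloud
streamlit
```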
