1. Processing and Visualizing Data
Data Processing in Social Media:
• Data Sources: Twitter, Facebook, Instagram, YouTube, Reddit, LinkedIn.
• Processing Steps:
o Data Cleaning: Remove bots, spam, emojis, HTML tags, and special characters.
o Preprocessing Text: Tokenization, stop-word removal, stemming/lemmatization.
o Data Enrichment: Add metadata like location, language, and device info.
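The cleaning and preprocessing steps above can be sketched with the Python standard library. This is a minimal illustration, not a production pipeline: the regular expressions and the tiny STOP_WORDS set are assumptions made for the example, and stemming/lemmatization is left out because it normally needs an NLP library.

```python
import html
import re

# Patterns are illustrative; real pipelines usually need platform-specific rules.
TAG_RE = re.compile(r"<[^>]+>")            # HTML tags such as <b> or <a href="...">
URL_RE = re.compile(r"https?://\S+")       # http/https links
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")   # anything except letters, digits, whitespace

# A tiny hypothetical stop-word list; libraries ship much longer ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "this"}


def clean_post(text: str) -> list[str]:
    """Strip markup and noise from one post and return its content tokens."""
    text = html.unescape(text)            # turn &amp; into & before filtering
    text = TAG_RE.sub(" ", text)          # drop HTML tags
    text = URL_RE.sub(" ", text)          # drop links
    text = text.lower()
    text = NON_WORD_RE.sub(" ", text)     # drop emojis, punctuation, symbols
    tokens = text.split()                 # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]
```

For example, clean_post("<b>This</b> is the BEST phone!") keeps only the content words "best" and "phone".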
Data Visualization:
• Objective: Present complex social media patterns in an understandable format.
• Common Techniques:
o Time Series: Post frequency, engagement over time.
o Pie/Bar Charts: Sentiment distribution, top hashtags.
o Network Diagrams: Retweet/reply networks.
o Geo-mapping: For location-based trends.
2. Influence Maximization in Social Media
• Goal: Identify key users who can maximize the spread of content.
• Methods:
o Degree Centrality: Users with most connections.
o Betweenness Centrality: Bridge users between communities.
o PageRank: Scores a user by the number and importance of the users who link to them.
• Real-World Use: Influencer marketing, content virality strategies.
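A small sketch of two of these measures on a toy follower graph. The graph, the user names, and the parameter values (damping of 0.85, 50 iterations) are assumptions for illustration. Betweenness centrality is omitted because it needs shortest-path computations that graph libraries usually provide.

```python
# Hypothetical follower graph: an edge u -> v means "u follows v".
FOLLOWS = {
    "ana": ["cat"],
    "bob": ["cat"],
    "cat": ["dan"],
    "dan": ["cat"],
}


def in_degree(graph):
    """Count followers (incoming edges) for every user."""
    counts = {user: 0 for user in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; assumes every user follows at least one account."""
    n = len(graph)
    rank = {user: 1.0 / n for user in graph}
    for _ in range(iterations):
        # Every user starts each round with the "random jump" share.
        new_rank = {user: (1.0 - damping) / n for user in graph}
        for user, targets in graph.items():
            share = damping * rank[user] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


followers = in_degree(FOLLOWS)
scores = pagerank(FOLLOWS)
top_user = max(scores, key=scores.get)   # candidate seed user for a campaign
```

In this toy graph "cat" has the most followers and also the highest PageRank, so both measures agree on the seed user; on real networks they often differ.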
3. Link Prediction
• Purpose: Predict possible future connections (e.g., “People You May Know”).
• Techniques:
o Similarity Metrics: Jaccard similarity, Common neighbors.
o Graph-based Learning: Using GNNs or embeddings.
• Applications in Social Media: Friend recommendations, community formation
insights.
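The two similarity metrics can be computed directly from adjacency sets. The friendship graph below is hypothetical, and ranking unconnected pairs by Jaccard score is just one simple way to turn the metric into recommendations.

```python
from itertools import combinations

# Hypothetical undirected friendship graph as adjacency sets.
FRIENDS = {
    "ana": {"bob", "cat"},
    "bob": {"ana", "cat", "dan"},
    "cat": {"ana", "bob", "dan"},
    "dan": {"bob", "cat"},
}


def common_neighbors(graph, u, v):
    """Number of friends that u and v share."""
    return len(graph[u] & graph[v])


def jaccard(graph, u, v):
    """Shared friends divided by all distinct friends of u and v."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def rank_candidates(graph):
    """Score every pair that is not yet connected, best first."""
    scored = [
        (jaccard(graph, u, v), u, v)
        for u, v in combinations(sorted(graph), 2)
        if v not in graph[u]
    ]
    return sorted(scored, reverse=True)
```

Here "ana" and "dan" are the only unconnected pair, and they share both of their friends, so they would be the first "People You May Know" suggestion.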
4. Collective Classification
• Definition: Predict the labels (e.g., interests, ideologies) of users based on their
neighbors in a social graph.
• Example: If most of a user’s friends support a political party, that user may be predicted
to support it too.
• Used in: Social behavior modeling, fake news detection.
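One simple way to realize this idea is iterative majority voting: unlabeled users repeatedly take the most common label among their labeled neighbors until nothing changes. The sketch below assumes a hypothetical graph and label names; published collective classification methods are usually more elaborate and also use each user's own features.

```python
from collections import Counter

# Hypothetical friendship graph and a few known labels.
FRIENDS = {
    "ana": ["bob", "cat"],
    "bob": ["ana", "cat"],
    "cat": ["ana", "bob", "dan"],
    "dan": ["cat", "eve", "fay"],
    "eve": ["dan"],
    "fay": ["dan"],
}
KNOWN = {"ana": "party_x", "bob": "party_x", "eve": "party_y", "fay": "party_y"}


def classify(graph, known, rounds=5):
    """Assign each unlabeled user the majority label of its labeled neighbors."""
    labels = dict(known)
    for _ in range(rounds):
        updates = {}
        for user in sorted(graph):
            if user in known:
                continue  # observed labels stay fixed
            votes = Counter(labels[n] for n in graph[user] if n in labels)
            if votes:
                # On a tie, Counter keeps the label seen first among the neighbors.
                updates[user] = votes.most_common(1)[0][0]
        if all(labels.get(u) == lab for u, lab in updates.items()):
            break  # no label changed, so the assignment is stable
        labels.update(updates)
    return labels
```

With these inputs, "cat" is predicted as party_x (two of three friends) and "dan" as party_y (two of three friends).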
5. Applications in Advertising and Game Analytics
Advertising:
• Targeted Campaigns: Audience segmentation using engagement and interests.
• Sentiment Analysis: Track audience sentiment to monitor and manage brand reputation.
• Ad Performance Monitoring: CTR, impressions, conversions.
• Social Listening: Detect consumer needs and brand mentions.
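The ad performance metrics are simple ratios. In this sketch the function name is hypothetical, and conversion rate is assumed to mean conversions per click; some teams divide by impressions instead.

```python
def ad_metrics(impressions: int, clicks: int, conversions: int) -> dict[str, float]:
    """Compute click-through rate and conversion rate for one ad."""
    ctr = clicks / impressions if impressions else 0.0          # clicks per impression
    conversion_rate = conversions / clicks if clicks else 0.0   # conversions per click (assumed definition)
    return {"ctr": ctr, "conversion_rate": conversion_rate}
```

For an ad shown 10,000 times with 250 clicks and 20 conversions, this gives a CTR of 2.5% and a conversion rate of 8%.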
Game Analytics:
• Player Retention Metrics: Time spent, levels completed, session frequency.
• Churn Prediction: Identify when users are likely to leave.
• Monetization Analysis: In-app purchases and ad interactions.
• Player Behavior Analysis: Understand community formation in multiplayer games.
6. Collecting and Visualizing Social Media Data
• Data Collection Techniques:
o APIs: Twitter API, Instagram Graph API, YouTube Data API.
o Scraping Tools: BeautifulSoup, Scrapy (used with caution due to TOS).
• Data Types: Posts, likes, shares, comments, reactions, follower counts.
Visualization:
• Hashtag Clouds: Most frequent tags.
• Engagement Graphs: Likes/comments vs. time.
• Influence Maps: Network diagrams showing key users.
• Sentiment Maps: Sentiment by geography.
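A hashtag cloud starts from tag frequencies. The sketch below counts hashtags with the standard library; the regular expression and the choice to lowercase tags are assumptions, and the resulting counts would then be passed to a plotting or word-cloud tool.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")   # a '#' followed by letters, digits, or underscores


def top_hashtags(posts, k=3):
    """Return the k most frequent hashtags, ignoring case."""
    counts = Counter(
        tag.lower() for post in posts for tag in HASHTAG_RE.findall(post)
    )
    return counts.most_common(k)
```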
7. Visualization and Exploration
• Purpose: Discover patterns, anomalies, and insights in social media data.
• Tools:
o Python Libraries: Matplotlib, Plotly, Seaborn.
o Interactive Dashboards: Tableau, Power BI.
o Network Visualization: Gephi, Cytoscape.
• Exploratory Analysis: Use clustering, drill-down, and filtering to understand user
behavior and content performance.
8. Social Network and Web Data Analytics Methods
Clickstream Analysis:
• Definition: Track the sequence of clicks and navigations users make.
• Use in Social Media: Understand how users move between content and which types of
content drive deeper engagement.
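A basic clickstream summary counts how often users move from one page to the next. The sessions and page names below are hypothetical.

```python
from collections import Counter

# Hypothetical sessions: each list is the ordered pages one user visited.
SESSIONS = [
    ["feed", "video", "comments"],
    ["feed", "video", "profile"],
    ["feed", "profile"],
]


def transition_counts(sessions):
    """Count each consecutive (from_page, to_page) pair across all sessions."""
    counts = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))   # pair every page with its successor
    return counts
```

The most common transition here is feed to video, which suggests that videos are the main path into deeper engagement in this toy data.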
A/B Testing:
• Used For: Comparing two content strategies or UI changes.
• Example: Test two versions of a post thumbnail to see which gains more clicks.
Online Surveys:
• Purpose: Directly gather feedback from users.
• Use: Validate sentiment analysis, collect opinions for market research.
Web Crawling and Indexing:
• Crawling: Collecting public social media data.
• Indexing: Organizing data for search and retrieval.
• Applications: Trend monitoring, content discovery.
9. NLP Techniques for Micro-text Analysis
Challenges with Micro-texts (e.g., Tweets):
• Short, noisy, informal.
• Frequent use of hashtags, abbreviations, emojis.
Techniques:
• Text Preprocessing: Remove URLs, expand acronyms.
• Sentiment Analysis: Lexicon-based or ML-based.
• Topic Modeling: Latent Dirichlet Allocation (LDA).
• Named Entity Recognition (NER): Identify people, brands, places.
• Emotion Detection: Go beyond polarity to detect joy, anger, sadness, etc.
10. Trend Detection, Opinion Spread & Social Influence
Trend Detection:
• Monitor volume of mentions, retweets, hashtags over time.
• Identify viral content or sudden topic bursts.
Social Influencers & Judgements:
• Influencer Impact: Assess how influencers shape opinions or product perceptions.
• Metrics: Reach, engagement rate, sentiment impact.
Opinion Spread:
• Models:
o Independent Cascade Model
o Linear Threshold Model
• Used For: Modeling how beliefs, memes, or opinions spread in a network.
Judgement Influence:
• Analyzes how individuals change opinions due to social pressure or influencer
endorsement.
• Studied via behavioral analysis and sentiment shifts.
1. Natural Language Processing (NLP) Techniques for Micro-text Analysis
What is Micro-text?
• Short, informal pieces of text such as:
o Tweets, Facebook statuses
o Instagram captions
o YouTube comments
o Reddit threads, WhatsApp messages
Challenges in Analyzing Micro-texts:
• Limited context due to brevity
• Spelling errors, abbreviations, emojis, slang
• Multilingual or code-mixed content
• Use of hashtags, mentions, and URLs
NLP Techniques for Micro-text:
Preprocessing Steps:
• Tokenization (splitting into words)
• Lowercasing
• Removing stop words, punctuation, emojis
• Lemmatization/Stemming
• Handling hashtags and mentions intelligently
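One possible reading of "handling hashtags and mentions intelligently" is to keep the words inside a hashtag and to replace each mention with a generic placeholder. The sketch below makes those choices as assumptions; the "@user" placeholder and the CamelCase splitting rule are illustrative, not a standard.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
CAMEL_RE = re.compile(r"(?<=[a-z])(?=[A-Z])")   # boundary inside CamelCase words
TOKEN_RE = re.compile(r"@?\w+")                 # words, plus the @user placeholder


def expand_hashtag(match):
    """Turn '#JustDoIt' into ' Just Do It ' so the words survive tokenization."""
    return " " + CAMEL_RE.sub(" ", match.group(1)) + " "


def normalize_tweet(text):
    """Lowercased tokens with URLs removed, mentions masked, hashtags expanded."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" @user ", text)      # keep the fact that someone was mentioned
    text = HASHTAG_RE.sub(expand_hashtag, text)
    return TOKEN_RE.findall(text.lower())       # emojis and punctuation are dropped here
```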
Analysis Techniques:
• Sentiment Analysis: Detecting positive, negative, or neutral tone.
o Lexicon-based (e.g., VADER, SentiWordNet)
o Machine learning-based (e.g., SVM, BERT, LSTM)
• Topic Modeling: Discovering hidden topics in a set of documents.
o LDA (Latent Dirichlet Allocation)
o NMF (Non-negative Matrix Factorization)
• Named Entity Recognition (NER):
o Identifying names, brands, locations, products.
• Text Classification:
o Classifying posts into categories like “sports,” “politics,” or “entertainment”.
• Hashtag and Emoji Analysis:
o Mapping emojis and hashtags to emotions or topics.
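A lexicon-based sentiment scorer can be sketched in a few lines. The miniature LEXICON and NEGATIONS sets are hypothetical stand-ins; real lexicons such as VADER contain thousands of scored entries and extra rules for intensifiers, punctuation, and emojis.

```python
# Hypothetical miniature lexicon with polarity scores.
LEXICON = {"love": 2, "great": 2, "good": 1, "bad": -1, "terrible": -2, "hate": -2}
NEGATIONS = {"not", "never", "no"}


def sentiment(tokens):
    """Label a tokenized post as positive, negative, or neutral."""
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        value = LEXICON.get(token, 0)
        score += -value if negate else value
        negate = False  # a negation only flips the token right after it
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Machine learning approaches replace the fixed lexicon with a model trained on labeled posts, which tends to cope better with slang and sarcasm.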
2. Trend Analysis in Social Media
Objective:
• Identify emerging or popular topics based on user-generated content.
Techniques:
• Frequency-based: Track word/hashtag usage over time.
• Burst Detection: Identify sudden spikes in topic mentions.
• Time-Series Analysis: Plot keyword frequencies for pattern discovery.
• Clustering: Group similar posts to reveal trends.
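Burst detection can be approximated by comparing each time step with a trailing window of recent counts. The window size, the threshold of 3 standard deviations, and the floor of 1.0 on the standard deviation are assumptions chosen for this sketch.

```python
from statistics import mean, stdev


def detect_bursts(counts, window=5, threshold=3.0):
    """Return indices whose count far exceeds the trailing-window average."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]             # the `window` counts before step i
        mu, sigma = mean(history), stdev(history)
        # Floor sigma at 1.0 so a perfectly flat history does not flag tiny changes.
        if counts[i] > mu + threshold * max(sigma, 1.0):
            bursts.append(i)
    return bursts
```

Applied to hourly mention counts such as [10, 12, 11, 9, 10, 11, 60, 12, 10], only the spike to 60 (index 6) is flagged.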
Applications:
• Early detection of breaking news
• Market trend monitoring
• Political sentiment tracking
3. Role of Social Influencers on Judgements
Who Are Social Influencers?
• Individuals with large or highly engaged followings.
• Can influence opinions, trends, and purchase decisions.
Influence on Judgement:
• Authority Effect: Users trust influencers’ views more.
• Bandwagon Effect: Followers may adopt influencer opinions to align socially.
• Echo Chambers: Repetition of similar viewpoints amplifies influence.
Analysis Techniques:
• Engagement Metrics: Likes, shares, retweets, comments.
• Network Analysis: Identify central/influential nodes in social graphs.
• Sentiment Shift Tracking: Observe opinion changes after influencer posts.
4. Opinion Spread in Social Media
What is Opinion Spread?
• The process by which sentiments, beliefs, or opinions propagate in a social network.
Models Used:
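• Independent Cascade Model: Each newly influenced user gets one chance to influence
each of their not-yet-influenced neighbors, succeeding with some probability.
• Linear Threshold Model: Users adopt an opinion when the combined influence of their
already-influenced neighbors reaches a personal threshold.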
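Both models can be simulated on a small directed graph. In this sketch the graph and node names are hypothetical, the Independent Cascade version uses one activation probability for every edge, and the Linear Threshold version weights all in-neighbors equally; in practice both are usually estimated per edge.

```python
import random

# Hypothetical directed influence graph: u -> users that u can influence.
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def independent_cascade(graph, seeds, prob, rng):
    """Each newly active user gets one attempt to activate each inactive neighbor."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for user in frontier:
            for neighbor in graph[user]:
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier   # only users activated this round try next round
    return active


def linear_threshold(graph, seeds, thresholds):
    """A user activates once the share of active in-neighbors reaches its threshold."""
    in_neighbors = {user: [] for user in graph}
    for user, targets in graph.items():
        for target in targets:
            in_neighbors[target].append(user)
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for user in graph:
            if user in active or not in_neighbors[user]:
                continue
            share = sum(n in active for n in in_neighbors[user]) / len(in_neighbors[user])
            if share >= thresholds[user]:
                active.add(user)
                changed = True
    return active
```

Running independent_cascade many times with different random.Random seeds and averaging the size of the result gives an estimate of how far a seed set is expected to spread.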
Tracking Techniques:
• Track retweets, replies, and shares over time.
• Use graph diffusion models to simulate opinion propagation.
Applications:
• Political campaigns
• Social movements (e.g., awareness campaigns)
• Viral marketing
5. Judgement in Social Media Context
What is Judgement?
• Formation of opinions, beliefs, or decisions based on content consumed or shared.
Influencing Factors:
• Peer influence (likes/comments from friends)
• Influencer posts
• Exposure to trending or viral content
• Platform algorithms (content recommendations)
Study Techniques:
• Sentiment Change Detection: Pre- and post-exposure analysis.
• Survey-based Validation: Combine data-driven insights with self-reported opinions.
• A/B Testing: Test how different content affects users’ judgements.
Summary Table:
Aspect                 | Technique/Model                         | Application
Micro-text NLP         | Sentiment analysis, NER, Topic Modeling | Brand analysis, feedback mining
Trend Detection        | Frequency analysis, burst detection     | Market monitoring, event detection
Influence on Judgement | Network centrality, engagement metrics  | Influencer marketing
Opinion Spread         | Cascade/Threshold models                | Viral campaigns
Judgement Formation    | Sentiment shift, A/B testing            | Social psychology, consumer behavior
A/B Testing:
• Definition: A/B testing compares two versions (A and B) of a piece of content or
design (such as a webpage, email, or ad) to see which performs better on specific metrics.
• Used For: Comparing two content strategies or UI changes.
• Example: Test two versions of a post thumbnail to see which gains more clicks.
• Users are randomly assigned to either version A or version B.
• The performance of each version is tracked using key metrics (e.g., click-through
rate, conversion rate).
• The results are then analyzed to determine which version performs better.
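The analysis step is often done with a two-proportion z-test on click-through rates. The sketch below is one common choice rather than the only valid test; the function name is hypothetical, and it assumes large samples and at least one click and one non-click overall (otherwise the standard error is zero).

```python
from math import erfc, sqrt


def ab_test(clicks_a, views_a, clicks_b, views_b):
    """Compare the click-through rates of versions A and B."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / se
    p_value = erfc(abs(z) / sqrt(2))   # two-sided p-value, normal approximation
    return rate_a, rate_b, z, p_value
```

A small p-value (commonly below 0.05) suggests the difference between the thumbnails is unlikely to be due to chance alone.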
Web Crawling and Indexing:
Crawling and indexing are the steps search engines use to discover and organize content,
and together they determine whether a site can appear in search results.
1. Crawling (Discovery): Discovering and downloading web pages; in this context,
collecting public social media data.
2. Indexing (Storing Information): Storing, analyzing, and organizing the content found
during crawling for search and retrieval.
3. Ranking (Displaying the Most Relevant Results): Ordering indexed content by relevance
to a query.
Applications: Trend monitoring, content discovery.
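The indexing step is typically built around an inverted index, which maps each word to the documents that contain it. The sketch below is a minimal version with hypothetical function names; it splits on whitespace only and does no ranking.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())   # keep only documents that match all words
    return results
```

Once crawled posts are stored this way, a keyword lookup no longer needs to scan every post, which is what makes trend monitoring over large collections practical.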
Online Surveys:
• Definition: A way to collect data and gather information from people over the internet.
• Purpose: Directly gather feedback from users.
• Use: Validate sentiment analysis, collect opinions for market research.
Figure: Social media channels used most often (based on a 2021 online survey).