Unit-5 Notes BCAM 061

The document covers various aspects of data processing and analysis in social media, including data collection, visualization techniques, and methods for influence maximization and link prediction. It discusses the application of NLP techniques for analyzing micro-texts, trend detection, and the role of social influencers in shaping opinions. Additionally, it highlights the importance of A/B testing, web crawling, and online surveys for gathering insights and improving content strategies.

1. Processing and Visualizing Data

Data Processing in Social Media:

• Data Sources: Twitter, Facebook, Instagram, YouTube, Reddit, LinkedIn.

• Processing Steps:

o Data Cleaning: Remove bots, spam, emojis, HTML tags, and special characters.

o Preprocessing Text: Tokenization, stop-word removal, stemming/lemmatization.

o Data Enrichment: Add metadata like location, language, and device info.
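
The cleaning and preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not a production pipeline: the stop-word list, the regular expressions, and the sample post are assumptions made for the example, and stemming/lemmatization is left out because it normally relies on an NLP library.

```python
import html
import re

# Tiny illustrative stop-word list (an assumption; real pipelines use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "this"}

def clean_text(raw):
    """Strip HTML tags, URLs, emojis/special characters, and extra whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)             # HTML tags
    text = html.unescape(text)                      # entities such as &amp;
    text = re.sub(r"https?://\S+", " ", text)       # URLs
    text = re.sub(r"[^A-Za-z0-9#@'\s]", " ", text)  # emojis and special characters
    return re.sub(r"\s+", " ", text).strip()

def preprocess(raw):
    """Lowercase, tokenize on whitespace, and drop stop words."""
    tokens = clean_text(raw).lower().split()
    return [t for t in tokens if t not in STOP_WORDS]

# Hypothetical post used only for illustration.
sample = "<p>Loving the new phone 😍!! Check https://example.com &amp; tell me</p>"
tokens = preprocess(sample)
print(tokens)
```

Hashtag and mention markers (# and @) are deliberately kept by the character filter so that later steps can still recognize them.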

Data Visualization:

• Objective: Present complex social media patterns in an understandable format.

• Common Techniques:

o Time Series: Post frequency, engagement over time.

o Pie/Bar Charts: Sentiment distribution, top hashtags.

o Network Diagrams: Retweet/reply networks.

o Geo-mapping: For location-based trends.

2. Influence Maximization in Social Media

• Goal: Identify key users who can maximize the spread of content.

• Methods:

o Degree Centrality: Users with most connections.

o Betweenness Centrality: Bridge users between communities.

o PageRank: Measures node importance based on connections.

• Real-World Use: Influencer marketing, content virality strategies.
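
As a rough sketch of two of the measures above, the following code computes in-degree centrality and a basic power-iteration PageRank on a small directed graph. The user names, the edge direction (u -> v meaning "u follows v"), the damping factor of 0.85, and the iteration count are assumptions chosen for illustration.

```python
# Hypothetical follower graph: an edge (u, v) means "u follows v".
edges = [("ann", "bob"), ("cat", "bob"), ("dan", "bob"), ("bob", "eve"), ("eve", "ann")]

def in_degree_centrality(edges):
    """Fraction of the other users that link to each user."""
    nodes = sorted({n for e in edges for n in e})
    counts = {n: 0 for n in nodes}
    for _, dst in edges:
        counts[dst] += 1
    return {n: c / (len(nodes) - 1) for n, c in counts.items()}

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; nodes without out-links spread rank evenly."""
    nodes = sorted({n for e in edges for n in e})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

degree = in_degree_centrality(edges)
ranks = pagerank(edges)
top_user = max(ranks, key=ranks.get)
print(degree["bob"], top_user)
```

Betweenness centrality is omitted here because it requires shortest-path counting; graph libraries usually provide it.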

3. Link Prediction

• Purpose: Predict possible future connections (e.g., “People You May Know”).

• Techniques:

o Similarity Metrics: Jaccard similarity, Common neighbors.

o Graph-based Learning: Using GNNs or embeddings.

• Applications in Social Media: Friend recommendations, community formation insights.

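
The two similarity metrics named above can be illustrated on a toy friendship graph. The graph and names are hypothetical; the sketch scores every pair of users who are not yet connected and ranks them by Jaccard similarity.

```python
from itertools import combinations

# Hypothetical undirected friendship graph stored as adjacency sets.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat"},
}

def common_neighbors(graph, u, v):
    """Number of friends that u and v share."""
    return len(graph[u] & graph[v])

def jaccard(graph, u, v):
    """Shared friends divided by all friends of either user."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0

def rank_candidate_links(graph):
    """Score every pair that is not yet connected, best candidates first."""
    scores = []
    for u, v in combinations(sorted(graph), 2):
        if v not in graph[u]:
            scores.append((u, v, common_neighbors(graph, u, v), jaccard(graph, u, v)))
    return sorted(scores, key=lambda s: s[3], reverse=True)

suggestions = rank_candidate_links(friends)
print(suggestions)
```

Here the only unconnected pair is ann and dan, who share both of their friends, so they would be the top "People You May Know" suggestion.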
4. Collective Classification

• Definition: Predict the labels (e.g., interests, ideologies) of users based on their
neighbors in a social graph.

• Example: If most of a user’s friends support a political party, that user may be predicted
to support it too.

• Used in: Social behavior modeling, fake news detection.
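
One simple way to realize this idea is iterative majority voting over neighbors, sketched below. This is only one of several collective classification approaches, and the graph, the labels, and the alphabetical tie-break are assumptions made for the example.

```python
from collections import Counter

# Hypothetical social graph and a few observed labels.
neighbors = {
    "u1": ["u2", "u3"],
    "u2": ["u1", "u3", "u4"],
    "u3": ["u1", "u2", "u4"],
    "u4": ["u2", "u3", "u5"],
    "u5": ["u4", "u6", "u7"],
    "u6": ["u5", "u7"],
    "u7": ["u5", "u6"],
}
known = {"u1": "sports", "u3": "sports", "u6": "politics", "u7": "politics"}

def collective_classify(neighbors, known, max_rounds=10):
    """Repeatedly give each unlabeled user the majority label of their neighbors."""
    labels = dict(known)
    for _ in range(max_rounds):
        changed = False
        for user in sorted(neighbors):
            if user in known:
                continue  # observed labels stay fixed
            votes = Counter(labels[n] for n in neighbors[user] if n in labels)
            if not votes:
                continue
            # Ties are broken alphabetically, an arbitrary choice for this sketch.
            best = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]
            if labels.get(user) != best:
                labels[user] = best
                changed = True
        if not changed:
            break
    return labels

predicted = collective_classify(neighbors, known)
print(predicted)
```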

5. Applications in Advertising and Game Analytics

Advertising:

• Targeted Campaigns: Audience segmentation using engagement and interests.

• Sentiment Analysis: To shape brand reputation.

• Ad Performance Monitoring: CTR, impressions, conversions.

• Social Listening: Detect consumer needs and brand mentions.
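
The ad performance metrics listed above reduce to simple ratios. The sketch below assumes click-through rate is clicks divided by impressions and conversion rate is conversions divided by clicks (some teams divide by impressions instead); the campaign numbers are made up.

```python
def ad_metrics(impressions, clicks, conversions):
    """Return click-through rate and conversion rate as fractions."""
    ctr = clicks / impressions if impressions else 0.0
    conversion_rate = conversions / clicks if clicks else 0.0
    return {"ctr": ctr, "conversion_rate": conversion_rate}

# Hypothetical campaign numbers, for illustration only.
campaign = ad_metrics(impressions=20000, clicks=500, conversions=25)
print(campaign)
```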

Game Analytics:

• Player Retention Metrics: Time spent, levels completed, session frequency.

• Churn Prediction: Identify when users are likely to leave.

• Monetization Analysis: In-app purchases and ad interactions.

• Player Behavior Analysis: Understand community formation in multiplayer games.

6. Collecting and Visualizing Social Media Data

• Data Collection Techniques:

o APIs: Twitter API, Instagram Graph API, YouTube Data API.

o Scraping Tools: BeautifulSoup, Scrapy (used with caution due to TOS).

• Data Types: Posts, likes, shares, comments, reactions, follower counts.

Visualization:

• Hashtag Clouds: Most frequent tags.

• Engagement Graphs: Likes/comments vs. time.

• Influence Maps: Network diagrams showing key users.

• Sentiment Maps: Sentiment by geography.

7. Visualization and Exploration

• Purpose: Discover patterns, anomalies, and insights in social media data.


• Tools:

o Python Libraries: Matplotlib, Plotly, Seaborn.

o Interactive Dashboards: Tableau, Power BI.

o Network Visualization: Gephi, Cytoscape.

• Exploratory Analysis: Use clustering, drill-down, filtering to understand user behavior and content performance.

8. Social Network and Web Data Analytics Methods

Clickstream Analysis:

• Definition: Track the sequence of clicks and navigations users make.

• Use in Social Media: Understand how users move between content, which types of
content drive deeper engagement.
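
A basic form of clickstream analysis is counting page-to-page transitions. The sketch below assumes each session is already available as an ordered list of page names; the sessions themselves are hypothetical.

```python
from collections import Counter

# Hypothetical sessions: each list is the ordered pages one user visited.
sessions = [
    ["feed", "video", "profile"],
    ["feed", "video", "comments"],
    ["feed", "profile"],
]

def transition_counts(sessions):
    """Count how often users move from one page directly to the next."""
    counts = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))
    return counts

transitions = transition_counts(sessions)
most_common_step = transitions.most_common(1)[0]
print(most_common_step)
```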

A/B Testing:

• Used For: Comparing two content strategies or UI changes.

• Example: Test two versions of a post thumbnail to see which gains more clicks.

Online Surveys:

• Purpose: Directly gather feedback from users.

• Use: Validate sentiment analysis, collect opinions for market research.

Web Crawling and Indexing:

• Crawling: Collecting public social media data.

• Indexing: Organizing data for search and retrieval.

• Applications: Trend monitoring, content discovery.

9. NLP Techniques for Micro-text Analysis

Challenges with Micro-texts (e.g., Tweets):

• Short, noisy, informal.

• Frequent use of hashtags, abbreviations, emojis.

Techniques:

• Text Preprocessing: Remove URLs, expand acronyms.

• Sentiment Analysis: Lexicon-based or ML-based.

• Topic Modeling: Latent Dirichlet Allocation (LDA).

• Named Entity Recognition (NER): Identify people, brands, places.


• Emotion Detection: Go beyond polarity to detect joy, anger, sadness, etc.

10. Trend Detection, Opinion Spread & Social Influence

Trend Detection:

• Monitor volume of mentions, retweets, hashtags over time.

• Identify viral content or sudden topic bursts.

Social Influencers & Judgements:

• Influencer Impact: Assess how influencers shape opinions or product perceptions.

• Metrics: Reach, engagement rate, sentiment impact.

Opinion Spread:

• Models:

o Independent Cascade Model

o Linear Threshold Model

• Used For: Modeling how beliefs, memes, or opinions spread in a network.

Judgement Influence:

• Analyzes how individuals change opinions due to social pressure or influencer endorsement.

• Studied via behavioral analysis and sentiment shifts.

1. Natural Language Processing (NLP) Techniques for Micro-text Analysis

What is Micro-text?

• Short, informal pieces of text such as:

o Tweets, Facebook statuses

o Instagram captions

o YouTube comments

o Reddit threads, WhatsApp messages

Challenges in Analyzing Micro-texts:

• Limited context due to brevity

• Spelling errors, abbreviations, emojis, slang

• Multilingual or code-mixed content

• Use of hashtags, mentions, and URLs


NLP Techniques for Micro-text:

Preprocessing Steps:

• Tokenization (splitting into words)

• Lowercasing

• Removing stop words, punctuation, emojis

• Lemmatization/Stemming

• Handling hashtags and mentions intelligently
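
One possible reading of "handling hashtags and mentions intelligently" is to replace mentions with a placeholder and split CamelCase hashtags into words. The sketch below takes that approach; the placeholder token <user> and the splitting rule are assumptions, and all-lowercase hashtags are left as a single token.

```python
import re

def split_hashtag(tag):
    """Split a CamelCase hashtag body into lowercase words and numbers."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", tag)
    return [w.lower() for w in words]

def tokenize_microtext(text):
    """Lowercase tokens, replace mentions with a placeholder, expand hashtags."""
    tokens = []
    for raw in text.split():
        word = raw.strip(".,!?:;\"'()")
        if not word:
            continue
        if word.startswith("@") and len(word) > 1:
            tokens.append("<user>")
        elif word.startswith("#") and len(word) > 1:
            tokens.extend(split_hashtag(word[1:]))
        else:
            tokens.append(word.lower())
    return tokens

# Hypothetical tweet used only for illustration.
tokens = tokenize_microtext("Congrats @TeamIndia! #WorldCup2023 was epic")
print(tokens)
```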

Analysis Techniques:

• Sentiment Analysis: Detecting positive, negative, or neutral tone.

o Lexicon-based (e.g., VADER, SentiWordNet)

o Machine learning-based (e.g., SVM, BERT, LSTM)

• Topic Modeling: Discovering hidden topics in a set of documents.

o LDA (Latent Dirichlet Allocation)

o NMF (Non-negative Matrix Factorization)

• Named Entity Recognition (NER):

o Identifying names, brands, locations, products.

• Text Classification:

o Classifying posts into categories like “sports,” “politics,” or “entertainment”.

• Hashtag and Emoji Analysis:

o Mapping emojis and hashtags to emotions or topics.
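
To make the lexicon-based approach concrete, here is a toy scorer. It is not VADER or SentiWordNet: the six-word lexicon, the polarity values, and the one-word negation rule are assumptions chosen to keep the example short.

```python
# Hypothetical mini-lexicon; real lexicons such as VADER are far larger.
LEXICON = {"love": 2, "great": 2, "good": 1, "bad": -1, "hate": -2, "terrible": -2}
NEGATIONS = {"not", "no", "never"}

def sentiment_score(tokens):
    """Sum word polarities, flipping the sign of a word that follows a negation."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    return score

def sentiment_label(text):
    """Map the total score to positive, negative, or neutral."""
    score = sentiment_score(text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

posts = ["I love this great phone", "not good at all", "battery arrived today"]
labels = [sentiment_label(p) for p in posts]
print(labels)
```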

2. Trend Analysis in Social Media

Objective:

• Identify emerging or popular topics based on user-generated content.

Techniques:

• Frequency-based: Track word/hashtag usage over time.

• Burst Detection: Identify sudden spikes in topic mentions.

• Time-Series Analysis: Plot keyword frequencies for pattern discovery.

• Clustering: Group similar posts to reveal trends.
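
Burst detection can be approximated by comparing each count with a rolling baseline. The sketch below flags a time step when it exceeds the mean of the previous few steps by more than a set number of standard deviations; the window size, the threshold, and the hourly counts are assumptions.

```python
from statistics import mean, pstdev

def detect_bursts(counts, window=5, threshold=3.0):
    """Flag index i when counts[i] is more than `threshold` standard
    deviations above the mean of the previous `window` values."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history) or 1.0  # avoid dividing by zero on flat history
        if (counts[i] - mu) / sigma > threshold:
            bursts.append(i)
    return bursts

# Hypothetical hourly mention counts for one hashtag.
hourly_mentions = [10, 12, 11, 9, 10, 11, 60, 55, 12, 10]
burst_hours = detect_bursts(hourly_mentions)
print(burst_hours)
```

Only the first hour of the spike is flagged, because the spike itself inflates the baseline for the hour that follows.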

Applications:

• Early detection of breaking news


• Market trend monitoring

• Political sentiment tracking

3. Role of Social Influencers on Judgements

Who Are Social Influencers?

• Individuals with large or highly engaged followings.

• Can influence opinions, trends, and purchase decisions.

Influence on Judgement:

• Authority Effect: Users trust influencers’ views more.

• Bandwagon Effect: Followers may adopt influencer opinions to align socially.

• Echo Chambers: Repetition of similar viewpoints amplifies influence.

Analysis Techniques:

• Engagement Metrics: Likes, shares, retweets, comments.

• Network Analysis: Identify central/influential nodes in social graphs.

• Sentiment Shift Tracking: Observe opinion changes after influencer posts.

4. Opinion Spread in Social Media

What is Opinion Spread?

• The process by which sentiments, beliefs, or opinions propagate in a social network.

Models Used:

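• Independent Cascade Model: Each newly influenced user gets one chance to influence each of their neighbors, succeeding with some probability.

• Linear Threshold Model: Users adopt an opinion when the combined influence of their already-influenced neighbors exceeds a threshold.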
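
Both models can be simulated on a small directed graph. The sketch below is a simplified version under stated assumptions: the graph is hypothetical, the Independent Cascade uses one shared activation probability for every edge, and the Linear Threshold variant gives every in-neighbor equal weight and activates a user when the active share reaches the threshold.

```python
import random

# Hypothetical directed graph: an edge u -> v means u can influence v.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}

def independent_cascade(graph, seeds, prob, rng):
    """Each newly active user gets one chance to activate each inactive neighbor."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for user in frontier:
            for neighbor in graph[user]:
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active

def linear_threshold(graph, seeds, thresholds):
    """A user activates once the active share of in-neighbors reaches their threshold."""
    in_neighbors = {n: [] for n in graph}
    for src, targets in graph.items():
        for dst in targets:
            in_neighbors[dst].append(src)
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for user in graph:
            if user in active or not in_neighbors[user]:
                continue
            share = sum(n in active for n in in_neighbors[user]) / len(in_neighbors[user])
            if share >= thresholds[user]:
                active.add(user)
                changed = True
    return active

thresholds = {n: 0.6 for n in graph}
ic_all = independent_cascade(graph, ["a"], prob=1.0, rng=random.Random(0))
ic_none = independent_cascade(graph, ["a"], prob=0.0, rng=random.Random(0))
lt_full = linear_threshold(graph, ["a"], thresholds)
lt_blocked = linear_threshold(graph, ["b"], thresholds)
print(sorted(lt_full), sorted(lt_blocked))
```

Seeding user b alone stalls in the threshold model because user d needs more than half of its in-neighbors active, while seeding user a reaches the whole graph.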

Tracking Techniques:

• Track retweets, replies, and shares over time.

• Use graph diffusion models to simulate opinion propagation.

Applications:

• Political campaigns

• Social movements (e.g., awareness campaigns)

• Viral marketing
5. Judgement in Social Media Context

What is Judgement?

• Formation of opinions, beliefs, or decisions based on content consumed or shared.

Influencing Factors:

• Peer influence (likes/comments from friends)

• Influencer posts

• Exposure to trending or viral content

• Platform algorithms (content recommendations)

Study Techniques:

• Sentiment Change Detection: Pre- and post-exposure analysis.

• Survey-based Validation: Combine data-driven insights with self-reported opinions.

• A/B Testing: Test how different content affects users’ judgements.

Summary Table:

Aspect | Technique/Model | Application
Micro-text NLP | Sentiment analysis, NER, Topic Modeling | Brand analysis, feedback mining
Trend Detection | Frequency analysis, burst detection | Market monitoring, event detection
Influence on Judgement | Network centrality, engagement metrics | Influencer marketing
Opinion Spread | Cascade/Threshold models | Viral campaigns
Judgement Formation | Sentiment shift, A/B testing | Social psychology, consumer behavior

A/B Testing:
• Definition: A/B testing compares two versions (A and B) of a piece of content or design (such as a webpage, email, or ad) to see which performs better on specific metrics.
• Used For: Comparing two content strategies or UI changes.
• Example: Test two versions of a post thumbnail to see which gains more clicks.
• Users are randomly assigned to either version A or version B.
• The performance of each version is tracked using key metrics (e.g., click-through rate, conversion rate).
• The results are then analyzed to determine which version performs better.
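
One common way to carry out the final analysis step is a two-proportion z-test on the click-through rates of the two versions. The notes do not name a specific test, so this choice is an assumption, as are the click counts and the 0.05 significance level.

```python
from math import erf, sqrt

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click-through rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical results for two thumbnail versions.
z, p = two_proportion_z_test(clicks_a=100, views_a=2000, clicks_b=150, views_b=2000)
b_is_better = z > 0 and p < 0.05
print(round(z, 2), b_is_better)
```

With these made-up numbers, version B's higher click-through rate is unlikely to be due to chance, so B would be chosen.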
Web Crawling and Indexing:
Crawling and indexing help a site rank in search results. Crawling is the discovery process search engines use to find content on a site.
1. Crawling: Discovery
Collecting public social media data.
2. Indexing: Storing Information
Organizing data for search and retrieval.
3. Ranking: Displaying the Most Relevant Results
Applications: Trend monitoring, content discovery.

Crawling involves discovering and downloading web pages, while indexing involves storing, analyzing, and organizing the content found during crawling.
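
The indexing step is often implemented as an inverted index that maps each word to the pages containing it. The sketch below builds one over a few in-memory strings that stand in for crawled pages; the page names and texts are hypothetical, and no real crawling or ranking is performed.

```python
from collections import defaultdict

# Hypothetical "crawled" pages; a real crawler would download these over HTTP.
pages = {
    "page1": "new phone launch trends on social media",
    "page2": "election trends and political opinion",
    "page3": "phone reviews and opinion",
}

def build_index(pages):
    """Map each word to the set of pages that contain it (an inverted index)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return pages containing every query word, sorted for stable output."""
    words = query.lower().split()
    if not words:
        return []
    results = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(results)

index = build_index(pages)
hits = search(index, "phone opinion")
print(hits)
```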
Online Surveys:
• Definition: A way to collect data and gather information from people over the internet.
• Purpose: Directly gather feedback from users.
• Use: Validate sentiment analysis, collect opinions for market research.

[Figure: Social media channels used most often, based on a 2021 online survey]
