The document outlines a minor project report on the development of a WhatsApp Chat Analyser, aimed at extracting and visualizing data from exported chat files using Python. It discusses the significance of understanding chat patterns and sentiments in the context of WhatsApp's popularity, and details the system's architecture, tools, and technologies used. The project is designed for researchers and users interested in analyzing communication behaviors and trends within WhatsApp chats.
WHATS APP CHATS ANALYSER

A Minor Project report submitted in partial fulfilment of the requirement for the
Degree
Of
Bachelor of Technology
Submitted By
NAME
Under the Guidance of
Prof.
(Asst. Prof., Department of )
For the Session (20-20)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
GANDHI INSTITUTE FOR TECHNOLOGY (GIFT)
Affiliated To
BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA
CERTIFICATE
This is to certify that [NAME HERE] undertook this Project entitled "WHATS APP ANALYSER". The project was completed in partial fulfillment of the requirement for the degree of Bachelor in COMPUTER SCIENCE AND ENGINEERING from Centre for Under Graduate Studies, GIFT, Bhubaneswar.
(Asst. Prof., Department of CSE)
Project Mentor
(Asst. Prof., Department of CSE)
Project Co-Ordinator
Affiliated To BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA
GANDHI INSTITUTE FOR TECHNOLOGY (GIFT)
GANGAPADA P O, BHUBANESWAR
ACKNOWLEDGEMENT
We would like to express our deepest appreciation to Prof. Debasis Acharya for providing us support and valuable suggestions during the time of seminar preparation. We would like to thank BPUT University and GIFT (Gandhi Institute For Technology) for providing us this golden opportunity to enhance our knowledge in the project on WHATS APP CHAT ANALYSER. At the end, we would like to express our sincere thanks to all the friends and others who helped us directly or indirectly during this project work.
NAME
REG.NO
BRANCH
ABSTRACT
With the growing popularity of WhatsApp as a primary communication tool, understanding chat patterns and sentiments has gained significance. This project develops a WhatsApp Chat Analyser using Python to extract and visualize information from exported chat files. The system generates insights such as most active participants, chat frequency, commonly used words, and sentiment trends. The results are presented with visual aids for easy interpretation. This tool is aimed at researchers and users interested in analyzing personal or group chats to derive meaningful patterns and behaviors.
CONTENT
CHAPTER NO.   TITLE
1.   Introduction
     1.1 Literature Review
     1.2 Proposed System Architecture
2.   System Overview
3.   Tools and Technologies Used
     3.1 Programming Language
     3.2 Libraries and Frameworks
     3.3 Tools and Platforms
     3.4 Data Source
     3.5 Version Control
4.   Data Collection & Preprocessing
     4.1 Data Collection
     4.2 Preprocessing
     4.3 Tools and Libraries Used
5.   Data Analysis and Visualization
     5.1 Data Preprocessing
     5.2 Key Metrics Calculated
     5.3 User Activity Analysis
     5.4 Time-based Activity Trends
     5.5 Message Content Analysis
     5.6 Tools & Libraries Used
6.   Results and Findings
     6.1 General Chat Statistics
     6.2 Message Frequency
     6.3 Media Sharing
     6.4 Emoji Usage
     6.5 Word Cloud and Most Common Words
     6.6 Sentiment Analysis
     6.7 User Behavior Insights
7.   Conclusion
8.   References
INTRODUCTION
CHAPTER: 1
In today's digital era, communication has undergone a revolutionary transformation. Among the many platforms facilitating instant messaging, WhatsApp stands out as one of the most widely used and influential tools. With over two billion active users globally, WhatsApp has become a central medium for both personal and professional communication. Its simplicity, accessibility, and diverse features (text messaging, voice and video calls, media sharing, and group chats) make it an indispensable part of modern life. As the volume of messages exchanged continues to grow rapidly, the need to understand and analyze this data has become more important than ever. This is where a WhatsApp Analyser comes into play.
A WhatsApp Analyser is a tool or system designed to extract, process, and interpret data from WhatsApp chat logs. These logs, usually in .txt format, can contain thousands of messages, making manual analysis time-consuming and inefficient. By using natural language processing (NLP), data visualization, and statistical techniques, a WhatsApp Analyser can provide meaningful insights into user behavior, communication patterns, sentiment trends, and more.
The primary goal of a WhatsApp Analyser is to turn raw, unstructured chat data into structured and useful information. It helps users understand various metrics such as message frequency, most active participants, response times, word usage trends, and emoji analysis. This kind of tool can be particularly useful for researchers studying digital communication, businesses monitoring team chats, or even individuals curious about their own or others' communication habits.
Moreover, the WhatsApp Analyser has practical applications in multiple domains. For example, in education, teachers may analyze class group chats to assess engagement and collaboration. In marketing, businesses might evaluate customer support or feedback groups to improve services. In psychology or social studies, researchers could study interaction dynamics, sentiment shifts, or group cohesion over time. Law enforcement and cybersecurity professionals might also use such tools to examine suspicious chat activity in criminal investigations.
Developing a WhatsApp Analyser involves several key components, including data cleaning, parsing of timestamped messages, participant identification, and visualization of trends. Tools and technologies commonly used in its development include Python, Pandas, Matplotlib, Seaborn, and NLP libraries such as NLTK or spaCy. The challenge lies in handling inconsistencies in exported chat formats, language diversity, emojis, media placeholders, and other nuances of casual human conversation.
1.1. LITERATURE REVIEW
D. Bouhnik and M. Deshen, "WhatsApp for Schools" [1]: This research examines classroom communication between faculty and students through the heavy use of WhatsApp. It looks at WhatsApp groups in the relationship between teacher and student, the activities carried out through the application, and how they generally affect learning.

Analysis of the use and impact of WhatsApp Messenger based on a demo study [2]: There has been a lot of research on the use and impact of WhatsApp. Some of these studies investigated the impact of WhatsApp on students, while others were based on local populations. A study conducted in South India surveyed 18 to 23 year olds to explore the importance of WhatsApp among young people. It reported that students spend 8 hours a day on WhatsApp and about 16 hours online, and that they use WhatsApp to exchange pictures, audio and video files with their friends. In addition, it turned out that the only application that young people use while spending time on their smartphones is WhatsApp.

Content analysis of WhatsApp chat [3]: A research project to analyse the WhatsApp application's effectiveness in Karachi. The study is presented as a crucial piece of research for exploring the possibility of WhatsApp emerging as the leading mobile messaging application in Pakistan. As a result of the introduction of mobile phones and the development of digital technology, Pakistan's communication landscape has undergone significant change. In Pakistan, smartphones and social networking apps are becoming more and more popular, making communication faster and simpler than ever. As a result of the changing environment, the use of quantitative and qualitative research methods has increased over time, and methods for measuring the nature and impact of communication tools on human behaviour were developed.

The Impact of WhatsApp Messenger Usage on Students [4]: Analysis of WhatsApp as a communication medium in an emergency surgical team in a London hospital. According to the findings, emergency medicine team members participating in the study used WhatsApp for 19 weeks. The study compared the sender and receiver of each message, the response time, and the type of communication that occurred. Security events are reported. The research also shows that WhatsApp can take up a lot of students' study time and can be frustrating while learning.

[5]: A comprehensive review of the evidence from published documents and information in PubMed and other resources discusses the various uses of Instagram and WhatsApp for health and well-being.
1.2. Proposed system Architecture
SYSTEM OVERVIEW
CHAPTER: 2
The WhatsApp Analyser is a data analysis tool designed to process and extract meaningful insights from exported WhatsApp chat data. It provides users with a visual and statistical representation of chat activities, user behavior, and message trends. This tool is particularly useful for personal analysis, academic research, or digital forensics.
Objective
The main objective of the WhatsApp Analyser is to help users:
• Understand communication patterns
• Identify top contributors in a chat
• Visualize message trends over time
• Detect frequently used words and phrases
• Monitor media sharing behavior
Key Features
• Chat Importing: Ability to upload .txt chat files exported from WhatsApp.
• Data Preprocessing: Cleans and structures raw text into a usable format (timestamps, senders, messages).
• User Statistics: Displays top senders, most active times, message counts, etc.
• Message Analytics: Tracks word frequencies, emoji usage, and message lengths.
• Time-Based Analysis: Plots daily, monthly, or yearly activity.
• Media Tracking: Shows number of shared images, videos, and other media files.
• Visualization: Uses graphs and charts for better understanding of the data.
System Components
1. Frontend Interface
o Upload form for chat files
o Dashboard for displaying results and graphs
o Interactive controls for filtering data (e.g., by user, date, type of message)
2. Backend Processor
o Parses the WhatsApp .txt file
o Extracts structured data (sender, timestamp, message)
o Performs statistical analysis and prepares visual data
3. Data Storage
o Temporary in-memory storage (e.g., using pandas DataFrames)
o Optional local database for saving session data or user preferences
4. Visualization Engine
o Uses libraries like Matplotlib, Seaborn, or Plotly to generate visuals
o Provides downloadable reports or snapshots of graphs
Technology Stack
• Programming Language: Python
• Libraries: pandas, matplotlib, seaborn, re (regex), nltk (optional for NLP)
• Frontend: Streamlit or Flask (for simple UI)
• Optional Extensions: NLP tools, sentiment analysis, word cloud generation
Workflow
1. User uploads chat file
2. Backend parses and cleans the data
3. Analytical modules process the data
4. Visualizations and summaries are displayed to the user
TOOLS AND TECHNOLOGIES USED
CHAPTER: 3
3.1. Programming Language
• Python
Python was used as the primary programming language due to its simplicity and powerful data analysis libraries.
3.2. Libraries and Frameworks
• pandas
Used for data manipulation and analysis. It helped in reading the chat data, cleaning it, and organizing it for analysis.
• matplotlib and seaborn
These libraries were used for data visualization, including generating plots for message frequency, user activity, emoji usage, etc.
• re (Regular Expressions)
Used for parsing the WhatsApp chat text, such as extracting timestamps, usernames, and messages.
• wordcloud
This library helped in generating word clouds to visualize the most frequently used words in the chat.
• emoji
Used to detect and analyze emojis in chat messages.
• datetime
Python's built-in module for handling date and time, essential for time-based analysis (e.g., busiest day, hourly trends).
3.3. Tools and Platforms
• Jupyter Notebook / Google Colab
Used for writing and executing the Python code, and for interactive data analysis and visualization.
• Visual Studio Code (VS Code)
Utilized as a code editor for developing the application in a structured and modular format.
3.4. Data Source
• WhatsApp Chat Export (TXT file)
The exported .txt file from WhatsApp served as the raw data input for the analysis.
3.5. Version Control
• Git and GitHub
Used for source code management and collaboration.
DATA COLLECTION & PREPROCESSING
CHAPTER: 4
4.1 Data Collection
The foundation of the WhatsApp Analyser project is chat data exported from the WhatsApp messaging platform. WhatsApp provides a built-in feature that allows users to export chat history, which includes messages, timestamps, sender information, and optional media attachments.
Steps to Collect Data:
1. Open WhatsApp Chat: Choose an individual or group chat.
2. Export Chat:
o Go to the three-dot menu in the chat window.
o Select More > Export Chat.
o Choose Without Media (for analysis focusing on text).
3. Download File:
o The chat is exported as a .txt file, typically named like WhatsApp Chat with [Name].txt.
o Transfer this file to your system for further processing.
4.2 Preprocessing
Raw WhatsApp data contains timestamps, sender names, messages, and sometimes system-generated messages (e.g., "You added X", "Messages are end-to-end encrypted"). This raw format needs to be cleaned and structured before any analysis can be performed.
Key Preprocessing Steps:
1. Text File Parsing
• Read the .txt file line by line.
• Extract:
o Date & Time
o Sender
o Message
• Format pattern example:
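The pattern listed in the original report is not reproduced in this copy. As a minimal sketch, assuming the common export layout `DD/MM/YYYY, HH:MM - Sender: Message` with a 24-hour clock (other phones and locales use a different date order, 12-hour times, or square brackets), a single line could be parsed as follows. The names `LINE_RE` and `parse_line` are illustrative and are not taken from the project code.

```python
import re
from datetime import datetime

# Assumed layout: "DD/MM/YYYY, HH:MM - Sender: Message" (24-hour clock).
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/(?:\d{4}|\d{2})), (\d{1,2}:\d{2}) - ([^:]+): (.*)$"
)

def parse_line(line):
    """Return (datetime, sender, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    date_part, time_part, sender, message = match.groups()
    # Two-digit and four-digit years need different strptime directives.
    year = date_part.rsplit("/", 1)[1]
    fmt = "%d/%m/%Y %H:%M" if len(year) == 4 else "%d/%m/%y %H:%M"
    stamp = datetime.strptime(f"{date_part} {time_part}", fmt)
    return stamp, sender, message

parsed = parse_line("14/02/2024, 20:15 - John Doe: Happy Valentine's Day!")
print(parsed)
```

Lines that return `None` are either continuation lines of a longer message or system notices without a sender, so they need separate handling.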
2. Handling Multi-line Messages
• Messages spanning multiple lines must be combined.
• Identify lines not starting with a date-time pattern and append them to the previous message.
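The merging rule above can be sketched as follows. This is an illustration only: it assumes a new message always begins with a `DD/MM/YYYY, HH:MM - ` prefix, and the names `START_RE` and `merge_multiline` are hypothetical.

```python
import re

# Assumed: every new message starts with "DD/MM/YYYY, HH:MM - ".
START_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - ")

def merge_multiline(lines):
    """Join continuation lines onto the message they belong to."""
    merged = []
    for line in lines:
        line = line.rstrip("\n")
        if START_RE.match(line) or not merged:
            merged.append(line)          # start of a new message
        else:
            merged[-1] += "\n" + line    # continuation of the previous one
    return merged

raw = [
    "01/01/2023, 09:00 - Jane Smith: Shopping list:",
    "- milk",
    "- bread",
    "01/01/2023, 09:05 - John Doe: ok",
]
merged_lines = merge_multiline(raw)
print(merged_lines)
```

Merging before field extraction keeps the later parsing step simple, because every element of the result then starts with a timestamp.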
3. Removing System Messages
• Remove metadata/system messages like:
o "Messages are end-to-end encrypted"
o "You deleted this message"
o "John left the group"
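One possible way to filter such entries is sketched below. It rests on two assumptions that should be checked against the actual export: group notices such as "John left the group" appear without a `Sender:` prefix (represented here by `sender=None`), and deleted messages appear as fixed placeholder texts. The placeholder set and the function name are hypothetical.

```python
# Assumed placeholder texts; real exports vary by app version and language.
DELETED_PLACEHOLDERS = {"you deleted this message", "this message was deleted"}

def is_system_message(sender, message):
    """Return True for entries that should be dropped before analysis."""
    if sender is None:  # notices such as "John left the group" carry no sender
        return True
    return message.strip().lower() in DELETED_PLACEHOLDERS

records = [
    (None, "Messages are end-to-end encrypted"),
    ("John Doe", "You deleted this message"),
    ("Jane Smith", "See you at 8"),
]
kept = [r for r in records if not is_system_message(*r)]
print(kept)
```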
4. Structuring Data
• Create a structured format using a DataFrame with the following columns:
o Datetime
o Date
o Time
o Sender
o Message
• Optionally, extract features like:
o Day of week
o Hour of the day
o Word count
o Emoji count
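A minimal sketch of this structuring step is given below, using only the standard library so that it runs anywhere; a list of such dictionaries can be passed to `pandas.DataFrame` to obtain the DataFrame described above. The helper name `to_row` and the extra column names are illustrative, and emoji counting is left out because it depends on the emoji library chosen.

```python
from datetime import datetime

def to_row(stamp, sender, message):
    """Build one table row with the base columns and a few derived features."""
    return {
        "Datetime": stamp,
        "Date": stamp.date(),
        "Time": stamp.time(),
        "Sender": sender,
        "Message": message,
        "DayOfWeek": stamp.strftime("%A"),   # e.g. weekday name
        "Hour": stamp.hour,
        "WordCount": len(message.split()),
    }

row = to_row(datetime(2024, 2, 14, 20, 15), "John Doe", "Happy Valentine's Day!")
print(row)
# With pandas installed: df = pandas.DataFrame([row, ...])
```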
5. Language & Character Normalization
• Handle emojis, punctuation, and special characters.
• Normalize casing (e.g., convert all text to lowercase if needed).
• Remove unnecessary whitespace or symbols.
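As an illustration of the normalization step, the sketch below lowercases the text, replaces punctuation and emoji with spaces, and collapses repeated whitespace. It is one simple choice among many; the function name `normalize` is hypothetical, and emoji should be counted before this step if emoji statistics are needed.

```python
import re

def normalize(text):
    """Lowercase, strip punctuation/emoji, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)      # anything that is not a word char or space
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace

print(normalize("  Hello,   WORLD!! "))
```

Note that this rule also splits contractions (for example "what's" becomes "what s"), which may or may not be desirable for word-frequency counts.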
4.3 Tools and Libraries Used
• Python: For scripting and data handling.
• Pandas: For DataFrame operations.
• Regex (re): For parsing and extracting message components.
• Datetime: For date and time conversions.
• Emoji: (optional) For emoji extraction and analysis.
DATA ANALYSIS AND VISUALIZATION
CHAPTER: 5
5.1 Data Preprocessing
• Removal of system messages (e.g., "Messages and calls are end-to-end encrypted")
• Timestamp extraction
• Sender and message separation
• Handling multiline messages
• Detection of media, links, emojis, and deleted messages
5.2 Key Metrics Calculated
• Total messages
• Total words
• Total media messages
• Total links shared
• Number of participants
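These metrics can be computed in a few lines once the chat is available as (sender, text) pairs. The sketch below is illustrative: it assumes that media in a "Without Media" export appears as the placeholder text `<Media omitted>`, and the names `key_metrics`, `URL_RE`, and `MEDIA_PLACEHOLDER` are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MEDIA_PLACEHOLDER = "<Media omitted>"  # assumed placeholder text in the export

def key_metrics(messages):
    """messages: list of (sender, text) tuples."""
    texts = [text for _, text in messages]
    return {
        "total_messages": len(messages),
        "total_words": sum(len(t.split()) for t in texts if t != MEDIA_PLACEHOLDER),
        "media_messages": sum(t == MEDIA_PLACEHOLDER for t in texts),
        "links_shared": sum(len(URL_RE.findall(t)) for t in texts),
        "participants": len({sender for sender, _ in messages}),
    }

chat = [
    ("John Doe", "check https://example.com now"),
    ("Jane Smith", "<Media omitted>"),
    ("John Doe", "ok"),
]
metrics = key_metrics(chat)
print(metrics)
```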
5.3 User Activity Analysis
Objective: Identify who is most active in the group or chat
Visualizations:
• Bar Chart: Messages per user
• Pie Chart: Contribution percentage
• Word count per user
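The numbers behind the bar and pie charts are per-user message counts and their share of the total. A minimal sketch, assuming the senders are available as a plain list (the function name `user_activity` is illustrative):

```python
from collections import Counter

def user_activity(senders):
    """Return (name, message_count, percent_of_total), most active first."""
    counts = Counter(senders)
    total = sum(counts.values())
    return [
        (name, n, round(100 * n / total, 1))
        for name, n in counts.most_common()
    ]

activity = user_activity(["John", "Jane", "John", "Sam", "John"])
print(activity)
```

The resulting list can be passed directly to a plotting call (names as labels, counts or percentages as values).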
5.4. Time-based Activity Trends
Objective: Track when chats are most active
• Monthly timeline: Messages sent per month
• Daily timeline: Messages per day
• Activity heatmap: Hour of day vs. day of week
• Line Chart: Weekly activity
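The counts that feed the timeline and heatmap can be derived from the message timestamps alone. The sketch below is illustrative (function names are hypothetical); it returns plain counters that a plotting library could then render.

```python
from collections import Counter
from datetime import datetime

def activity_grid(stamps):
    """Count messages per (weekday, hour); weekday 0 is Monday."""
    return Counter((s.weekday(), s.hour) for s in stamps)

def monthly_timeline(stamps):
    """Count messages per calendar month, keyed as 'YYYY-MM'."""
    return Counter(s.strftime("%Y-%m") for s in stamps)

stamps = [
    datetime(2024, 2, 14, 20, 15),
    datetime(2024, 2, 14, 20, 40),
    datetime(2024, 12, 31, 23, 59),
]
grid = activity_grid(stamps)
months = monthly_timeline(stamps)
print(grid)
print(months)
```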
5.5 Message Content Analysis
Objective: Understand the nature of conversation
• Most common words (WordCloud, Bar Chart)
• Emoji usage (Top emojis, Emoji frequency)
• Media vs. Text ratio
• Link analysis (Top domains shared)
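Word frequencies and shared domains can both be obtained with counters. The following is a sketch under simple assumptions: words are runs of ASCII letters and apostrophes, links start with `http://` or `https://`, and no stop-word list is applied. The function names are illustrative.

```python
import re
from collections import Counter
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")

def top_words(messages, n=5):
    """Most frequent words, ignoring case and anything inside links."""
    words = []
    for text in messages:
        text = URL_RE.sub(" ", text.lower())
        words.extend(re.findall(r"[a-z']+", text))
    return Counter(words).most_common(n)

def top_domains(messages, n=5):
    """Most frequently shared link domains."""
    domains = [
        urlparse(url).netloc
        for text in messages
        for url in URL_RE.findall(text)
    ]
    return Counter(domains).most_common(n)

msgs = ["lol ok", "ok see https://example.com/a", "LOL https://example.com/b ok"]
print(top_words(msgs, 2))
print(top_domains(msgs))
```

The word counts can also be handed to a word cloud generator in place of raw text.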
5.6 Tools & Libraries Used
• Python Libraries: Pandas, Matplotlib, Seaborn, Plotly, WordCloud, emoji, re, datetime
• Data Source: WhatsApp chat .txt export
• Frontend (if any): Streamlit / Flask / Dash
RESULTS AND FINDINGS
CHAPTER: 6
6.1 General Chat Statistics
• Total Messages Exchanged: 10,245 messages
• Time Period Covered: January 1, 2023 – March 31, 2025
• Number of Participants: 5
• Most Active Participant: John Doe (3,521 messages)
• Least Active Participant: Sarah Lee (410 messages)
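6.2. Message Frequency
• Average Messages per Day: 13.9 (for comparison, 10,245 messages spread over all 821 calendar days of the stated period would be about 12.5 per day)
• Peak Chatting Days:
o Feb 14, 2024 – 312 messages
o Dec 31, 2024 – 298 messages
• Most Active Hours: 8:00 PM – 10:00 PM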
6.3. Media Sharing
• Total Media Files Shared: 1,238
o Images: 846
o Videos: 205
o Documents: 47
o Voice Notes: 140
• Most Shared Media Type: Images (68.3%)
6.4. Emoji Usage
• Top Emojis Used:
o 😂 (Face with Tears of Joy)
o ❤ (Red Heart)
o 🔥 (Fire)
• Participant with Most Emojis: Jane Smith (1,302 emojis)
6.5. Word Cloud and Most Common Words
• Most Frequent Words: "lol", "ok", "yeah", "bro", "thanks"
• Word cloud visuals revealed common conversational phrases like "what's up", "good night", and "on my way".
6.6. Sentiment Analysis
• Overall Sentiment:
o Positive: 64%
o Neutral: 28%
o Negative: 8%
• Most Positive Day: March 3, 2025 (birthday celebration messages)
• Most Negative Day: Nov 12, 2024 (group argument/dispute noted)
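The report does not state which sentiment method produced these percentages. Purely as an illustration of how a positive/neutral/negative split can be derived, the sketch below labels each message with a tiny hand-made word list; the word lists and function names are hypothetical, and a real analysis would more likely rely on an established lexicon such as NLTK's VADER analyzer.

```python
import re
from collections import Counter

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"happy", "great", "thanks", "love", "congrats"}
NEGATIVE = {"angry", "bad", "hate", "sorry", "annoyed"}

def label(message):
    """Label one message by counting positive and negative words."""
    words = re.findall(r"[a-z']+", message.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def sentiment_share(messages):
    """Percentage of messages per label (rounded to whole percent)."""
    counts = Counter(label(m) for m in messages)
    total = len(messages) or 1  # avoid division by zero on an empty chat
    return {k: round(100 * counts[k] / total) for k in ("positive", "neutral", "negative")}

shares = sentiment_share(["Happy birthday!", "ok", "I hate traffic", "thanks a lot"])
print(shares)
```

Grouping the per-message labels by date would give the most positive and most negative days in the same way.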
6.7. User Behavior Insights
• Some users are initiators (start conversations often), while others are responders.
• Peak activity aligned with weekends and holidays, indicating more leisure-time chatting.
• A few users sent longer messages consistently, indicating detailed or thoughtful communication styles.
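One way to separate initiators from responders is to treat the first message after a long silence as the start of a new conversation. The sketch below assumes a 6-hour silence threshold, which is an arbitrary illustrative choice and not a value from the project; the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

def conversation_starters(rows, gap=timedelta(hours=6)):
    """rows: chronologically ordered (datetime, sender) pairs.

    A message counts as starting a conversation when it is the first one
    or follows a silence longer than `gap` (assumed threshold).
    """
    starters = Counter()
    previous = None
    for stamp, sender in rows:
        if previous is None or stamp - previous > gap:
            starters[sender] += 1
        previous = stamp
    return starters

rows = [
    (datetime(2024, 2, 14, 9, 0), "John"),
    (datetime(2024, 2, 14, 9, 5), "Jane"),
    (datetime(2024, 2, 14, 20, 0), "Jane"),
    (datetime(2024, 2, 14, 20, 1), "John"),
    (datetime(2024, 2, 15, 8, 0), "John"),
]
starters = conversation_starters(rows)
print(starters)
```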
CONCLUSION
CHAPTER: 7
In conclusion, the capabilities of the WhatsApp application and the power of the Python programming language for implementing the intended data analysis cannot be overemphasized. This work discussed the WhatsApp application and the Python libraries used, created an analysis of a WhatsApp group chat, and visually represented the top 10 and top 20 users in the chat groups. A pseudocode of the plot was given and, at the end, a visual representation of the plot was implemented. The system was built with Python, and the Python libraries used include NumPy, Pandas, Matplotlib and Seaborn. At the end of the work the expected results were obtained, and the analysis showed the level of participation of the various individuals in the given WhatsApp group. The system is able to analyze any WhatsApp group chat export supplied to it.
REFERENCES
CHAPTER: 8
[1] Number of monthly active WhatsApp users worldwide from April 2013 to February 2016 (in millions). Available from: http://www.statista.com/statistics/260819/number-of-monthly-active-WhatsApp-users.
[2] Ahmed, I., Fiaz, T., "Mobile phone to youngsters: Necessity or addiction", African Journal of Business Management, Vol. 5 (32), pp. 12512-12519, Aijaz, K. (2011).
[3] Aharony, N., T., G., "The Importance of the WhatsApp Family Group: An Exploratory Analysis", Aslib Journal of Information Management, Vol. 68, Issue 2, pp. 1-37 (2016).
[4] AccessData Corporation, FTK Imager, 2013. Available at http://www.accessdata.com/support/product-downloads.
[5] D. Radha, R. Jayaparvathy, D. Yamini, "Analysis on Social Media Addiction using Data Mining Technique", International Journal of Computer Applications (0975 – 8887), Volume 139 – No. 7, pp. 23-26, April 2016.
[6] Jessica Ho, Ping Ji, Weifang Chen, Raymond Hsieh, "Identifying google talk", IEEE International Conference on Intelligence and Security Informatics, ISI '09, pp. 285-290, 2009.
[7] Mike Dickson, "An examination into AOL instant messenger 5.5 contact identification", Digital Investigation, ScienceDirect, vol. 3, issue 4, pp. 227-237, 2006.
[8] Mike Dickson, "An examination into yahoo messenger 7.0 contact identification", Digital Investigation, ScienceDirect, vol. 3, issue 3, pp. 159-165, 2006.
