
Sentiment Analysis

Project Based Learning (PBL) Report


for the course
Statistics for Machine Learning – 20MA32L01

BACHELOR OF TECHNOLOGY

IN

COMPUTER SCIENCE AND ENGINEERING

By
23R15A0525 - S. Akshara
23R15A0524 - R. Navaneetha
23R15A0523 - P. Envitha

Under the guidance of


Dr. A. Srinivasulu

Department of Computer Science and Engineering


Accredited by NBA

Geethanjali College of Engineering and Technology


(UGC Autonomous)
(Affiliated to J.N.T.U.H, Approved by AICTE, New Delhi)
Cheeryal (V), Keesara (M), Medchal Dist. - 501 301.
JUNE-2025
TABLE OF CONTENTS

S.No. Contents Page No

1 ACKNOWLEDGEMENT 1

2 ABSTRACT 2

3 INTRODUCTION 3

4 SYSTEM DESIGN 7

5 IMPLEMENTATION 8

6 SAMPLE CODE 9

7 OUTPUT SCREENS 11

8 CONCLUSION 13

9 REFERENCES 14
ACKNOWLEDGEMENT
We would like to acknowledge and give our warmest thanks to our faculty, Dr. A. Srinivasulu,
who made this work possible. His guidance and advice carried us through all the stages of
writing this project. We would also like to thank our classmates for making our defence an
enjoyable moment and for their brilliant comments and suggestions. We would also like to give
special thanks to our families for their continuous support and understanding while we
undertook the research and writing of this project, and for providing the required equipment.
The project would not have been successful without their cooperation and inputs.

1
ABSTRACT

This project presents a foundational approach to sentiment analysis using deep learning with
TensorFlow and Keras. Sentiment analysis is a key task in Natural Language Processing (NLP) that
involves determining the emotional tone behind text data. The goal of this project is to classify short
sentences into positive or negative sentiments. A small custom dataset is created with labeled
sentences expressing either positive or negative emotions. The preprocessing phase includes
tokenizing the text using Keras’ Tokenizer and padding the sequences to ensure uniform input
lengths for the neural network.

The model is a simple sequential neural network that processes the tokenized text and learns to
identify sentiment patterns. It is trained using binary classification techniques to predict whether a
given sentence has a positive (label 1) or negative (label 0) sentiment. Through this implementation,
the project demonstrates key steps such as data preparation, text vectorization, neural network
construction, and training for binary sentiment classification.

Although this is a basic implementation, it effectively introduces important concepts in text
classification and serves as a practical guide for beginners in machine learning and NLP. The
project can be further extended by incorporating a larger dataset, using word embeddings, or
applying more complex deep learning architectures like LSTM or GRU.

2
INTRODUCTION

About the project

In today's digital era, vast amounts of textual data are generated daily through social media, product
reviews, forums, and other online platforms. Analyzing the sentiment behind this text helps
businesses, researchers, and developers understand public opinion, improve customer experiences,
and make informed decisions. Sentiment analysis, a subfield of Natural Language Processing (NLP),
involves determining whether a piece of text expresses a positive, negative, or neutral sentiment.

This project aims to build a basic sentiment analysis model using Python and deep learning libraries
such as TensorFlow and Keras. The model classifies short sentences into binary categories: positive
or negative sentiment. A small, manually created dataset is used to train the model, with each
sentence labeled accordingly. The workflow includes text preprocessing through tokenization and
sequence padding, followed by the development and training of a neural network model.

The purpose of this project is to provide a practical and educational implementation of sentiment
analysis suitable for beginners. It introduces core concepts such as text-to-sequence conversion,
neural network design, and binary classification. While the model is simple, it lays the foundation
for more advanced techniques in NLP. The project can be enhanced by using larger datasets,
pre-trained word embeddings, or more sophisticated models like LSTMs or Transformers.

Project outcomes and objectives

1. Understand Sentiment Analysis Concepts
To gain a clear understanding of what sentiment analysis is and how it is used in real-world
applications.

3
2. Implement Text Preprocessing Techniques
To learn and apply preprocessing steps such as tokenization and padding, preparing raw
text for deep learning models.
3. Develop a Binary Sentiment Classification Model
To build a neural network using TensorFlow and Keras that can classify text into positive
or negative sentiments.
4. Train and Evaluate the Model
To train the model on a small labeled dataset and assess its performance using appropriate
metrics.
5. Provide a Simple Educational NLP Solution
To create a beginner-friendly implementation that demonstrates the basic steps in building
a text classification model using deep learning.

Project Outcomes

• A working sentiment analysis model capable of predicting positive or negative sentiment
from short text inputs.
• A clear understanding of how to preprocess textual data for machine learning.
• Practical experience in building and training neural networks using Keras.
• A foundation for more advanced NLP tasks such as multi-class sentiment analysis, emotion
detection, or model deployment.
• A Jupyter Notebook that serves as a learning tool for future NLP or machine learning
projects.

4
Key Features

1. Text Preprocessing Module

Function:
Prepares raw textual data for analysis by converting text into a numerical format usable by machine
learning models.

Key Features:

• Tokenization: Converts words into integer sequences using Keras Tokenizer.
• Padding: Ensures consistent input lengths using pad_sequences.
• Vocabulary Generation: Builds a word index to maintain consistent encoding.
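
As a quick illustration of these steps, here is a minimal sketch that tokenizes and pads two
example sentences (the sentences and the printed values are illustrative only, not part of the
project dataset):

# Minimal sketch of tokenization, vocabulary generation, and padding
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

example_texts = ["great service", "very poor quality"]    # illustrative inputs

tokenizer = Tokenizer()
tokenizer.fit_on_texts(example_texts)             # builds the word index (vocabulary)
print(tokenizer.word_index)                       # e.g. {'great': 1, 'service': 2, ...}

sequences = tokenizer.texts_to_sequences(example_texts)    # words -> integer sequences
padded = pad_sequences(sequences, padding='post')          # pad shorter sequences with zeros
print(padded)                                     # uniform-length integer matrix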

2. Sentiment Classification Model Module

Function:
Trains a deep learning model to classify text inputs as positive or negative sentiments.

Key Features:

• Neural Network Structure: Uses Embedding, Flatten, and Dense layers for classification.
• Training with Labels: Learns sentiment patterns from labeled training data.
• Binary Output: Outputs a probability between 0 and 1 (positive vs. negative sentiment).
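
The sketch below wraps this layer stack in a small helper function for reuse; vocab_size and
input_length are placeholders that would come from the preprocessing module, and the layer
sizes follow the implementation shown later in this report:

# Sketch: the Embedding -> Flatten -> Dense stack described above, as a reusable helper
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

def build_sentiment_model(vocab_size, input_length, embedding_dim=8):
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=embedding_dim,
                  input_length=input_length),     # one dense vector per word id
        Flatten(),                                # concatenate the word vectors
        Dense(16, activation='relu'),             # hidden layer
        Dense(1, activation='sigmoid')            # probability of positive sentiment
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Example (placeholders): build_sentiment_model(vocab_size=len(tokenizer.word_index) + 1,
#                                               input_length=padded.shape[1])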

3. Inference and Prediction Module

Function:
Uses the trained model to predict sentiment for new, unseen text inputs.

5
Key Features:

• Dynamic Input Handling: Accepts new text, then tokenizes and pads it based on the original
tokenizer.
• Probability Output: Returns the model’s confidence score for positive or negative sentiment.
• Decision Thresholding: Applies a threshold (e.g., > 0.5 is positive) to decide the final output.
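
One possible way to package these steps is sketched below; it assumes the tokenizer, trained
model, and training sequence length from the earlier modules are available, and the helper
name predict_sentiment is only illustrative:

# Sketch of an inference helper (assumes tokenizer, model and max_length already exist)
from tensorflow.keras.preprocessing.sequence import pad_sequences

def predict_sentiment(text, tokenizer, model, max_length, threshold=0.5):
    seq = tokenizer.texts_to_sequences([text])                     # reuse the original tokenizer
    pad = pad_sequences(seq, maxlen=max_length, padding='post')    # match the training input length
    prob = float(model.predict(pad, verbose=0)[0][0])              # confidence score in [0, 1]
    label = "Positive" if prob > threshold else "Negative"         # decision thresholding
    return label, prob

# Example: label, prob = predict_sentiment("I really enjoy this", tokenizer, model, max_length)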

4. Decision-Making Module

Function:
Interprets model outputs to make final sentiment decisions and guide responses.

Key Features:

• Threshold-Based Classification: Converts prediction probabilities into labels.
• Confidence Assessment: Optionally outputs prediction certainty to the user.
• Rule-Based Actions: Could trigger different responses or actions based on sentiment (e.g., an
alert on negative sentiment).
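
A minimal sketch of such a decision layer is shown below; the confidence measure and the alert
rule are illustrative assumptions rather than part of the project code:

# Sketch: turning a prediction probability into a label, a confidence value, and an action
def decide(prob, threshold=0.5):
    label = "Positive" if prob > threshold else "Negative"
    # Confidence: distance from the decision boundary, scaled to the range [0, 1]
    confidence = abs(prob - threshold) / max(threshold, 1 - threshold)
    if label == "Negative":
        print("ALERT: negative sentiment detected")    # example rule-based action
    return label, confidence

# Example: decide(0.12) returns ("Negative", 0.76) and prints the alert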

5. User Interface (Optional/Future Module)

Function:
Provides a user-friendly interface for inputting text and viewing sentiment results.

Key Features:

• Input Box for Text: Allows users to enter custom sentences.
• Prediction Display: Shows whether the sentiment is positive or negative along with
confidence.
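
One possible future implementation is sketched below using Streamlit (an assumption; Flask or
another framework would work equally well). It reuses the hypothetical predict_sentiment
helper described earlier and would be launched with the command streamlit run app.py:

# Sketch of a Streamlit front end (assumes streamlit is installed and that
# predict_sentiment, tokenizer, model and max_length can be imported or loaded here)
import streamlit as st

st.title("Sentiment Analysis")
user_text = st.text_input("Enter a sentence:")     # input box for text

if user_text:
    label, prob = predict_sentiment(user_text, tokenizer, model, max_length)
    st.write(f"Sentiment: {label} (confidence score: {prob:.2f})")   # prediction display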

6
SYSTEM DESIGN

Software Requirements

Operating System : Microsoft Windows

Software Name : Jupyter Notebook

Type : Web-based interactive development environment

Developers : Project Jupyter (originally created by Fernando Pérez)

Hardware Requirements

Device name : DESKTOP-0OGA6I1

Processor : AMD Ryzen 3 3250U with Radeon Graphics 2.60 GHz

Installed RAM : 8.00 GB (5.94 GB usable)

Device ID : 76DDCDEE-6C4D-43DB-99D9-4E23080623F7

System type : 64-bit operating system, x64-based processor

7
IMPLEMENTATION
Modules Implementation

Text Preprocessing Module


# Define input sentences and labels
sentences = [
    "I love this product",
    "This is the best thing ever",
    "Absolutely fantastic experience",
    "I hate this",
    "This is the worst",
    "Terrible and disappointing"
]
labels = [1, 1, 1, 0, 0, 0]

# Tokenization
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)

# Padding
padded_sequences = pad_sequences(sequences, padding='post')

Sentiment Classification Model Module


# Define features and labels
import numpy as np

X = padded_sequences
y = np.array(labels)   # Keras expects array-like labels

# Build model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

model = Sequential()
model.add(Embedding(input_dim=len(tokenizer.word_index) + 1, output_dim=8,
                    input_length=X.shape[1]))
model.add(Flatten())
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile and train model
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(X, y, epochs=10)
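
To check how well the model fits the training data (the dataset is too small for a separate test
split, so this only verifies that training worked), a quick evaluation step such as the following
could be added:

# Evaluate the trained model on the training data itself
loss, accuracy = model.evaluate(X, y, verbose=0)
print(f"Training loss: {loss:.4f}, training accuracy: {accuracy:.2f}")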

8
Inference and Prediction Module
# Predict sentiment of new text
new_text = ["I really enjoy this"]
new_seq = tokenizer.texts_to_sequences(new_text)
new_pad = pad_sequences(new_seq, maxlen=X.shape[1], padding='post')
prediction = model.predict(new_pad)
print("Positive" if prediction[0][0] > 0.5 else "Negative")

Sample Code

import numpy as np

texts = ["I love programming",
         "Python is awesome",
         "I hate bugs",
         "Debugging is fun",
         "I love solving problems",
         "I don't like errors"]
labels = [1, 1, 0, 1, 1, 0]

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)

sequences = tokenizer.texts_to_sequences(texts)
sequences   # display the integer sequences
texts       # display the original texts

9
max_length = max([len(sequence) for sequence in sequences])
max_length

X = pad_sequences(sequences, maxlen=max_length, padding='post')


X

y = np.array(labels)
y

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Dense, Flatten

model = Sequential()
model.add(Embedding(input_dim=len(tokenizer.word_index) + 1,
                    output_dim=8,
                    input_length=max_length))
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=20, batch_size=2)

sample_text = "i love programming"

sample_sequence = tokenizer.texts_to_sequences([sample_text])   # Tokenize the sample text
sample_padded = pad_sequences(sample_sequence, maxlen=max_length, padding='post')   # Pad the sequence
prediction = model.predict(sample_padded)

if prediction[0][0] > 0.5:
    print('positive')
else:
    print('negative')
print(prediction[0][0])

10
Sample Output

(Output screenshots from the Jupyter Notebook run are shown on pages 11 and 12.)

11

12
CONCLUSION

This sentiment analysis project successfully demonstrates how natural language processing (NLP)
and deep learning techniques can be applied to classify text data into positive or negative sentiments.
Through a structured and modular approach, the project walks through the essential stages of
preprocessing raw text using tokenization and padding, building and training a neural network model
with Keras, and making predictions on new text inputs.

Despite using a small custom dataset for simplicity, the project effectively highlights the workflow
of a typical machine learning pipeline—from data preparation to inference. The use of embedding
layers enables the model to understand word relationships, while the binary classification output
provides an interpretable result.

This beginner-friendly project serves as a strong foundation for more complex NLP tasks. It can be
extended further by incorporating larger and real-world datasets, more advanced model architectures
like LSTM or BERT, and deploying the model through a user interface using frameworks such as
Flask or Streamlit.
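
As one concrete example of such an extension, the Flatten layer could be replaced with an LSTM
layer while the rest of the pipeline (tokenization, padding, binary output) stays unchanged. A
minimal sketch, assuming the same tokenizer and padded data as in the sample code:

# Sketch: swapping the Flatten layer for an LSTM (layer sizes are illustrative)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

lstm_model = Sequential([
    Embedding(input_dim=len(tokenizer.word_index) + 1, output_dim=16),
    LSTM(32),                           # processes the word sequence in order
    Dense(1, activation='sigmoid')      # same binary sentiment output
])
lstm_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# lstm_model.fit(X, y, epochs=20)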

Overall, the project not only accomplishes its goal of performing basic sentiment analysis but also
offers valuable insights into the end-to-end development of an AI-based text classification system.

13
REFERENCES
https://www.eneuro.org
https://www.geeksforgeeks.org/python-programming-language/
https://stackoverflow.com/
https://ailocallist.com

14
