PROJECT REPORT
ON
CHURN PREDICTION
SUBMITTED IN FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
BACHELOR OF COMPUTER ENGINEERING (H)
OF
Faridabad, Haryana
2021-2025
SUBMITTED TO:
Dr. Umesh Kumar
(ASSOC. PROFESSOR)
SUBMITTED BY:
Rahul Kumar (21001050014)
CERTIFICATE
It is certified that the project report titled “Churn Prediction”, being
submitted by Rahul Kumar, student of B.Tech (Computer
Engineering, Hindi) (2021 to 2025), is genuine work carried
out by the student under my supervision and guidance.
This project work is in fulfillment of the requirement for the
degree of Bachelor of Computer Engineering, J.C. Bose UST,
Faridabad, Haryana.
Dr. Umesh Kumar
Assistant Professor
ACKNOWLEDGEMENT
Words are indeed inadequate to convey my deep sense of gratitude to all those who have
helped me in completing this project to the best of my ability. Being a part of this project
has certainly been a unique and very productive experience for me.
Among the many people who provided me with inspiration, guidance and
encouragement, I take this opportunity to thank those who gave me invaluable assistance
and constant support in completing this project.
I would like to thank Dr. Umesh Kumar, J.C. Bose UST, Faridabad, for her continuous
help in the completion of this project. She motivated me and was available whenever her
assistance was sought. She was actively involved throughout the project and was also
kind enough to point out my strengths and weaknesses and how I could improve myself to
face the corporate world. Without her support, the completion of this project would have
been impossible.
RAHUL KUMAR
TABLE OF CONTENTS
CANDIDATE'S DECLARATION
CERTIFICATE
ACKNOWLEDGEMENT
Introduction
This chapter introduces the motivation behind the project and the benefits
it offers to users.
Project Description and Methodology
This chapter gives an insight into the technology and database used in this
Churn Prediction web application.
Project Work Implementation
This chapter briefly explains the functioning of the project. It presents the
screens available in the project and explains the usage of the different
components and their implementation.
CODE SNAPSHOTS AND PAGE VIEW
This chapter presents the screens available in the project, explains the
usage of the different components and their implementation, and shares
screenshots of the code.
1. INTRODUCTION
In today's competitive market landscape, retaining customers is as important as acquiring new ones.
Customer churn, defined as the loss of clients or subscribers, can significantly impact a
company's revenue and reputation. This is particularly true for industries such as
telecommunications, banking, subscription services, and e-commerce, where customer loyalty is
vital for sustained growth. Predicting churn helps businesses identify customers who are likely to
leave and allows them to take proactive measures to retain them. Effective churn prediction not only
mitigates revenue loss but also reduces the cost associated with acquiring new customers, which is
typically higher than retaining existing ones.
Customer churn can occur due to various factors, including poor service quality, lack of
engagement, competitive offerings, or dissatisfaction with the product. Understanding these factors
and predicting which customers are at risk of leaving is a strategic advantage. By analyzing
historical data and customer behavior, companies can detect early signs of churn and implement
targeted retention strategies, such as personalized offers or improved customer service. This project
aims to harness the power of data analytics and machine learning to create a robust churn prediction
model that provides actionable insights.
Churn prediction models often use machine learning algorithms that analyze patterns in customer
behavior, such as purchase frequency, engagement levels, and service interactions. These insights
help businesses anticipate churn and implement timely interventions, ensuring long-term customer
relationships and sustainable growth.
Churn prediction has therefore become a cornerstone of modern customer relationship
management. For industries that depend on recurring revenue, such as telecommunications,
e-commerce, subscription services, and banking, minimizing churn is essential to sustaining
growth and profitability.
2. Project Description and Methodology
Python:- Python is used for server-side web development, software development, mathematics, and system
scripting, and is popular for Rapid Application Development and as a scripting or glue language to tie
existing components because of its high-level, built-in data structures, dynamic typing, and dynamic
binding. Program maintenance costs are reduced with Python due to the easily learned syntax and
emphasis on readability.
Jupyter Notebook:- Jupyter Notebook is an open-source, interactive computing environment that allows
users to create and share documents containing live code, visualizations, equations, and narrative text. It
supports multiple programming languages (like Python, R, and Julia) and is widely used for data analysis,
machine learning, scientific computing, and educational purposes. Jupyter's cell-based structure enables
incremental code execution, facilitating real-time feedback and collaboration.
Visual Studio Code:-
Visual Studio Code is a free source-code editor from Microsoft that runs on Windows, macOS and
Linux. It is distinct from the full Visual Studio IDE. Through extensions it supports many
programming languages, including Python, and it provides syntax highlighting, code completion,
debugging, an integrated terminal and built-in Git support.
Anaconda Navigator:-
Anaconda Navigator is a user-friendly desktop interface for managing packages, environments, and
applications within the Anaconda distribution. It allows users to easily install, update, and remove data
science libraries (like NumPy and TensorFlow) without using the command line. Navigator supports
environment creation and management, enabling isolated setups for different projects, which prevents
dependency conflicts.
3. Project Work Implementation
Project Overview
This project focuses on predicting customer churn and retention using machine
learning techniques. The model leverages logistic regression to classify customers
as retained or churned. Key components include data preparation in Excel, model
building in Python, validation using k-fold cross-validation, and an interactive
Streamlit dashboard for result visualization.
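For reference, the classification rule behind logistic regression can be written out as follows. The symbols (feature vector x, weights w, bias b) are standard textbook notation rather than names taken from this project, and the 0.5 cut-off is the common default, assumed here for illustration.

```latex
\[
P(\text{churn} \mid \mathbf{x})
  = \sigma(\mathbf{w}^{\top}\mathbf{x} + b)
  = \frac{1}{1 + e^{-(\mathbf{w}^{\top}\mathbf{x} + b)}},
\qquad
\hat{y} =
\begin{cases}
\text{churned}  & \text{if } P(\text{churn} \mid \mathbf{x}) \ge 0.5,\\
\text{retained} & \text{otherwise.}
\end{cases}
\]
```

The retention probability shown to the user is then 1 minus the churn probability.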
Data Preparation in Excel
Data Collection:
Compiled historical customer data, including features like demographics, subscription
history, and usage patterns.
Data Cleaning:
Removed duplicates, handled missing values, and normalized data.
Feature Engineering:
Created relevant features such as customer tenure, average spending, and engagement
scores.
Data Export:
Saved the cleaned and processed data as a CSV file for further analysis in
Python.
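In this project the preparation steps above were carried out in Excel. For readers who prefer a scripted version, the same steps could be reproduced in pandas roughly as sketched below. The column names and sample values are hypothetical and are not taken from the project data; median imputation and min-max scaling are assumed choices for "handled missing values" and "normalized data".

```python
import pandas as pd

# Hypothetical raw customer records (illustrative values only).
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "total_spend": [1200.0, 800.0, 800.0, None, 3000.0],
    "tenure_months": [24, 30, 30, 10, 40],
    "churned": [0, 1, 1, 0, 0],
})

# Data cleaning: remove exact duplicate rows, fill missing spend with the median.
clean = raw.drop_duplicates().copy()
clean["total_spend"] = clean["total_spend"].fillna(clean["total_spend"].median())

# Feature engineering: average spending per month of tenure.
clean["avg_monthly_spend"] = clean["total_spend"] / clean["tenure_months"]

# Normalization: min-max scale the engineered feature to the range [0, 1].
spend = clean["avg_monthly_spend"]
clean["avg_monthly_spend_scaled"] = (spend - spend.min()) / (spend.max() - spend.min())

# Data export: to_csv returns the CSV text when no path is given;
# pass a file path instead to write the file to disk.
csv_text = clean.to_csv(index=False)
print(csv_text)
```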
Model Implementation using Logistic Regression
Environment Setup:
Libraries Used: pandas, numpy, sklearn, streamlit, matplotlib, seaborn
Data Import: Loaded the CSV file into a pandas DataFrame.
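A minimal sketch of the data import step is given below. The helper name load_churn_data, the target column name "churned" and the sample rows are assumptions made for illustration; an in-memory buffer stands in for the exported CSV file so that the example runs on its own.

```python
import io

import pandas as pd


def load_churn_data(source, target="churned"):
    """Read the prepared CSV and check that the target column is present."""
    df = pd.read_csv(source)
    if target not in df.columns:
        raise ValueError(f"expected a '{target}' column in the input data")
    return df


# Stand-in for the exported CSV file (hypothetical columns and values).
sample_csv = io.StringIO(
    "tenure_months,avg_monthly_spend,engagement_score,churned\n"
    "24,50.0,0.8,0\n"
    "30,26.7,0.2,1\n"
    "10,120.0,0.6,0\n"
)

df = load_churn_data(sample_csv)
print(df.shape)
print(df.dtypes)
```

In practice a file path would be passed in place of the buffer.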
Logistic Regression Steps:
Data Splitting:
Split data into training (70%) and testing (30%) sets.
Model Training:
Implemented logistic regression using scikit-learn’s LogisticRegression class.
Hyperparameter Tuning:
Adjusted parameters like C (regularization strength) and penalty terms to optimize
performance.
Model Evaluation Metrics:
Accuracy, precision, recall, F1-score, and confusion matrix.
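The steps above can be sketched with scikit-learn as follows. This is an illustrative outline, not the project's exact code: synthetic data from make_classification stands in for the customer dataset, and the feature scaling step, the parameter grid, the liblinear solver and the F1 scoring choice are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared customer data (1 = churned, 0 = retained).
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           random_state=42)

# Data splitting: 70% training, 30% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# Model training: scale the features, then fit logistic regression.
pipeline = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="liblinear", max_iter=1000),
)

# Hyperparameter tuning: search over C and the penalty term (assumed grid).
param_grid = {
    "logisticregression__C": [0.01, 0.1, 1.0, 10.0],
    "logisticregression__penalty": ["l1", "l2"],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Model evaluation on the held-out test set.
y_pred = search.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
cm = confusion_matrix(y_test, y_pred)

print("best parameters:", search.best_params_)
print(metrics)
print(cm)
```

Tuning is done on the training split only, so the test set remains unseen until the final evaluation.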
K-Fold Cross-Validation
Objective:
Evaluate model stability and performance.
Implementation:
Used k-fold cross-validation (with k=5) to divide the data into 5 subsets. The
model was trained and tested 5 times, each time using a different subset for
testing and the remaining for training.
Mean Accuracy: Calculated the average accuracy across all folds to assess model
reliability.
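A minimal sketch of this validation step, assuming scikit-learn's KFold and cross_val_score, is shown below. Synthetic data again stands in for the customer dataset, and shuffling with a fixed seed is an assumption made for reproducibility.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the prepared customer data.
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           random_state=42)

model = LogisticRegression(max_iter=1000)

# k = 5: each fold is used once for testing while the other four train the model.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
fold_accuracy = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

# Mean accuracy across folds, with the spread as a rough stability indicator.
mean_accuracy = fold_accuracy.mean()
print("accuracy per fold:", fold_accuracy)
print("mean accuracy:", mean_accuracy, "std:", fold_accuracy.std())
```

A small standard deviation across folds suggests the model's performance does not depend heavily on which customers fall into the training set.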
Deployment with Streamlit
Purpose:
Create a user-friendly web application to visualize predictions and insights.
Streamlit Workflow:
UI Elements:
File uploader for the customer data.
Dropdowns and sliders for user inputs (optional).
Prediction Display:
Provided customer retention probability and classification.
Displayed performance metrics and visualizations (e.g., ROC curve, confusion matrix).
4. Snapshots of user interface:
5. Snapshots of code: