PSG INSTITUTE OF TECHNOLOGY AND APPLIED RESEARCH
COIMBATORE - 641062
(Autonomous)
NM 1074 - Project Work
CUSTOMER CHURN PREDICTION
TEAM MEMBERS:
TARUNSWAMY MURALIDHARAN (715523244053)
SACHIN RAM E S (715523244039)
VISAKAN M (715523244058)
PRAVEEN R (715523244302)
PROJECT GUIDE:
Dr. Vaishnavi S, ASSISTANT PROFESSOR/CSE
PROBLEM STATEMENT:
● High customer churn affects profitability in online food delivery services.
● Users frequently switch platforms due to better offers, poor service, or delivery
delays.
● Existing systems lack predictive capability to identify at-risk customers.
● Traditional analysis fails to capture behavioral and demographic churn patterns.
● Need for a machine learning-based system to forecast churn accurately.
OBJECTIVES:
● Predict customer churn in online food delivery platforms using machine learning.
● Identify key churn drivers such as delivery satisfaction, income, offers, and food
quality.
● Analyze structured survey data to uncover customer behavior patterns.
● Build accurate classification models (Logistic Regression, Random Forest, KNN).
● Provide actionable insights to support targeted customer retention strategies.
● Deploy a user-friendly interface for real-time churn prediction and decision support.
DATASET DESCRIPTION:
Source: Public dataset from Kaggle related to online food delivery services in Bangalore.
Type: Structured survey-based dataset with labeled churn outcomes.
Records: Includes several hundred customer entries with detailed attributes.
Target Variable:
● Churn – Indicates whether a customer has stopped using the service (Yes/No).
Key Features:
● Demographics: Age, gender, marital status, occupation, income level.
● Behavioral Data: Frequency of orders, self-cooking habits, health concerns.
● Service Feedback: Delivery satisfaction, food quality, offer expectations.
● App Usage Patterns: Ordering preferences, satisfaction levels, usability scores.
Nature of Data: Static (not updated in real time) but rich in behavioral insights.
PROJECT WORKFLOW:
1. Problem Identification
→ Churn is affecting online food delivery businesses' revenue and customer loyalty.
2. Data Collection
→ Dataset downloaded from Kaggle with demographic, behavioral, and feedback data.
3. Data Preprocessing
→ Cleaned missing values, encoded categories, and scaled numeric values.
4. Exploratory Data Analysis (EDA)
→ Used graphs and heatmaps to identify churn patterns and feature correlations.
5. Feature Engineering
→ Created new satisfaction scores and transformed raw inputs for better prediction.
6. Model Building
→ Trained Logistic Regression, Random Forest, and KNN models.
7. Model Evaluation
→ Evaluated using accuracy, precision, recall, F1-score, and confusion matrix.
8. Deployment
→ Developed a Streamlit and React interface connected to a Flask API for predictions.
9. Visualization Dashboard
→ Built dashboards to display predictions and churn insights clearly.
10. Outcome
→ Achieved up to 97% accuracy; system ready for real-time churn monitoring and decision-making.
Data Preprocessing
Data Cleaning
In this phase, the raw dataset is refined by:
● Removing duplicate records and irrelevant entries.
● Correcting inconsistencies in data labels or format (e.g., inconsistent spelling or units).
● Detecting and treating outliers that could skew the model’s learning process.
This step ensures the dataset is reliable and clean for further analysis.
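The cleaning steps above can be sketched roughly as follows. This is an illustrative sketch in plain Python, not the project's actual code: the field names (`age`, `gender`), the label map, and the 1.5 × IQR clipping rule are all assumptions chosen for the example.

```python
from statistics import quantiles

# Hypothetical raw survey rows; field names are illustrative only.
rows = [
    {"age": 22, "gender": "Male"},
    {"age": 22, "gender": "Male"},      # exact duplicate
    {"age": 25, "gender": " female "},  # inconsistent label
    {"age": 24, "gender": "F"},
    {"age": 23, "gender": "male"},
    {"age": 95, "gender": "Female"},    # likely outlier
]

# Assumed mapping from messy spellings to canonical labels.
GENDER_MAP = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}

def clean(rows):
    # 1. Remove exact duplicates, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    # 2. Normalise inconsistent category spellings.
    for row in unique:
        label = row["gender"].strip().lower()
        row["gender"] = GENDER_MAP.get(label, row["gender"])
    # 3. Clip numeric outliers to the 1.5 * IQR fences.
    ages = [row["age"] for row in unique]
    q1, _, q3 = quantiles(ages, n=4, method="inclusive")
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    for row in unique:
        row["age"] = min(max(row["age"], low), high)
    return unique

cleaned = clean(rows)
```

Clipping is only one way to treat outliers; dropping the affected rows or leaving them untouched may be more appropriate depending on the data.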
Data Visualization
Feature Engineering
To enhance the dataset’s predictive power, feature engineering techniques are applied. This
includes:
● Creating new features such as satisfaction score indexes derived from survey
responses.
● Aggregating and transforming raw inputs into more informative metrics.
● Encoding multi-level responses (e.g., "Very Satisfied", "Neutral", "Unsatisfied") into
numeric scales.
Well-engineered features directly contribute to improved model accuracy and
generalization.
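As a rough illustration of the encoding and score-index ideas above, the sketch below maps survey answers to a numeric scale and averages them into a single index. The five-point scale, the field names, and the rescaling to 0..1 are assumptions; the source names only some of the response labels and does not specify how its satisfaction scores were computed.

```python
# Assumed 5-point ordinal scale; the source mentions only three of these labels.
LIKERT = {
    "Very Unsatisfied": 1,
    "Unsatisfied": 2,
    "Neutral": 3,
    "Satisfied": 4,
    "Very Satisfied": 5,
}

def satisfaction_index(response, fields):
    """Mean of the encoded answers, rescaled from 1..5 to 0..1."""
    scores = [LIKERT[response[field]] for field in fields]
    mean = sum(scores) / len(scores)
    return (mean - 1) / 4

# Hypothetical survey fields for one respondent.
respondent = {
    "delivery_satisfaction": "Very Satisfied",
    "food_quality": "Neutral",
    "app_usability": "Satisfied",
}
index = satisfaction_index(respondent, list(respondent))
```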
Model Evaluation
Algorithm Functionality in the Churn Prediction Pipeline
🔹 Logistic Regression
● Used as a baseline model for predicting whether a customer will churn (Yes/No).
● Fits a logistic curve to customer features like delivery satisfaction, income, and food quality.
● Outputs a probability score; a score above 0.5 is classified as "Churn".
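The prediction step described above can be sketched as a sigmoid applied to a weighted sum of features, followed by the 0.5 cut-off. The coefficients and the two-feature input here are made up for illustration; a trained model would learn its own weights from the data.

```python
import math

def churn_probability(features, weights, bias):
    """Logistic model: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predict_churn(features, weights, bias, threshold=0.5):
    """Label the customer "Churn" when the probability exceeds the threshold."""
    probability = churn_probability(features, weights, bias)
    return "Churn" if probability > threshold else "No Churn"

# Made-up coefficients for [late_delivery, good_food_quality] on a 1-5 scale.
weights, bias = [0.9, -0.7], -0.5
unhappy = predict_churn([5, 1], weights, bias)  # frequent delays, poor food
happy = predict_churn([1, 5], weights, bias)    # rare delays, good food
```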
🔹 Random Forest
● Builds multiple decision trees, each trained on a different random sample of the dataset.
● Each tree predicts churn or no churn; final output is based on majority voting.
● Highly accurate for our dataset due to its ability to capture non-linear relationships (like behavior + demographics).
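The sketch below shows only the majority-voting step. The three "trees" are hand-written stand-in rules, not trained models, and their thresholds and field names are hypothetical; a real random forest learns each tree from a random sample of the training data.

```python
from collections import Counter

# Hand-written stand-ins for trained decision trees (illustrative rules only).
def tree_delivery(customer):
    return 1 if customer["late_delivery"] >= 4 else 0

def tree_quality(customer):
    return 1 if customer["good_food_quality"] <= 2 else 0

def tree_offers(customer):
    return 1 if customer["more_offers"] >= 4 and customer["age"] < 25 else 0

FOREST = [tree_delivery, tree_quality, tree_offers]

def forest_predict(customer, trees=FOREST):
    """Return the class predicted by most trees (1 = churn, 0 = no churn)."""
    votes = Counter(tree(customer) for tree in trees)
    return votes.most_common(1)[0][0]

customer = {"age": 22, "late_delivery": 5, "good_food_quality": 4, "more_offers": 5}
prediction = forest_predict(customer)  # the three trees vote 1, 0, 1
```

Using an odd number of trees avoids ties in a two-class vote.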
🔹 K-Nearest Neighbors (KNN)
● For any new customer, the model finds the k most similar past customers.
● Looks at how those similar customers behaved — if most of them churned, it predicts churn.
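A minimal sketch of that neighbour lookup, assuming Euclidean distance and k = 3. The past customers and their two features are made up for the example.

```python
import math
from collections import Counter

def knn_predict(history, new_customer, k=3):
    """history: list of (feature_vector, label) pairs; label 1 = churned."""
    # Sort past customers by distance to the new one and keep the k closest.
    nearest = sorted(history, key=lambda item: math.dist(item[0], new_customer))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up past customers: (late_delivery, good_food_quality) on a 1-5 scale.
history = [
    ((5, 1), 1), ((4, 2), 1), ((5, 2), 1),
    ((1, 5), 0), ((2, 4), 0), ((1, 4), 0),
]
prediction = knn_predict(history, (4, 1))
```

Because KNN relies on distances, features on different scales (for example age versus a 1-5 rating) generally need scaling first, which matches the scaling step in the preprocessing stage of the workflow.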
Churn Distribution by Demographics
Customer Churn Analysis Based on Demographic Features
Gender-Based Churn: Slightly higher churn among females; gender is not a strong churn predictor but may
suggest a need for service personalization.
Marital Status: Single customers churn more, while married ones show higher retention, indicating greater
stability.
Occupation: Students and self-employed individuals have higher churn rates; employees retain better, likely
due to income stability.
Monthly Income: Churn is higher among customers earning below ₹25,000; those earning over ₹50,000
show stronger retention.
Retention Strategy Tip: Offer loyalty perks to students and self-employed customers, and budget-friendly plans for lower-income groups.
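Comparisons like these come from computing the churn rate within each demographic group. The sketch below shows one way to do that; the rows are made up and do not reproduce the project's figures, and the field names are assumptions.

```python
from collections import defaultdict

def churn_rate_by(rows, column):
    """Share of churned customers within each category of `column`."""
    totals, churned = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        churned[row[column]] += row["churn"]  # churn is 1 (left) or 0 (stayed)
    return {group: churned[group] / totals[group] for group in totals}

# Made-up rows for illustration only.
rows = [
    {"occupation": "Student", "churn": 1},
    {"occupation": "Student", "churn": 1},
    {"occupation": "Student", "churn": 0},
    {"occupation": "Employee", "churn": 0},
    {"occupation": "Employee", "churn": 1},
    {"occupation": "Employee", "churn": 0},
    {"occupation": "Employee", "churn": 0},
]
rates = churn_rate_by(rows, "occupation")
```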
Evaluation Metrics Across Iterations
● The chart compares accuracy, precision, and recall over multiple training iterations, helping assess model stability and improvement.
● Precision remains consistently high, indicating that most predicted churns are accurate, while recall fluctuates due to class imbalance.
● The balanced improvement in all three metrics across iterations shows effective tuning of the model to optimize both performance and reliability.
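For reference, the metrics named above can be computed from the four confusion-matrix counts as sketched below. The labels are made up to show how an imbalanced sample can give high precision alongside low recall; they are not the project's results.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion": [[tn, fp], [fn, tp]],  # rows: actual 0/1, columns: predicted 0/1
    }

# Made-up, imbalanced example: 3 churners among 10 customers, only 1 detected.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this example every predicted churn is correct, yet two of the three actual churners are missed, so accuracy alone would overstate how well churn is being caught.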
Heatmap:
Heatmap Analysis: Feature Correlation Matrix
● This heatmap visualizes the correlation between all features in the dataset used for churn prediction.
● Strong positive correlations (green) and negative correlations (red) help identify which factors move
together.
● Features like late delivery, bad past experience, discount dependency, and poor food quality show
high correlation with churn (Output).
● The matrix also helps eliminate redundant or less impactful variables during feature selection.
● Such visual analysis improves model performance and provides business insights for targeting key pain
points.
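Each cell of such a heatmap is a correlation coefficient between two columns. The sketch below computes the Pearson coefficient of each feature against the churn column and ranks features by strength. Pearson correlation is an assumption here, since the slides do not state which coefficient was used, and the values are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up survey columns on a 1-5 scale, plus the churn label (1 = churned).
data = {
    "late_delivery":     [5, 4, 5, 1, 2, 1],
    "good_food_quality": [1, 2, 2, 5, 4, 5],
    "churn":             [1, 1, 1, 0, 0, 0],
}

corr_with_churn = {
    name: pearson(values, data["churn"])
    for name, values in data.items()
    if name != "churn"
}
# Rank features by the absolute strength of their relationship with churn.
ranked = sorted(corr_with_churn, key=lambda name: abs(corr_with_churn[name]), reverse=True)
```

The same function applied to pairs of features can flag redundant ones: two features that correlate very strongly with each other carry largely the same information.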
BACK-END (PYTHON) :
import traceback

from flask import Flask, request, jsonify
from flask_cors import CORS
import pandas as pd
import joblib

app = Flask(__name__)
CORS(app)  # allow the React front end, served from another origin, to call this API

# Load the trained classifier once at startup.
model = joblib.load("model.pkl")

# Feature columns expected by the model.
features = ["Age", "Health Concern", "Self Cooking", "Good Food quality",
            "Late Delivery", "More Offers and Discount"]

@app.route('/predict', methods=['POST'])
def predict():
    try:
        data = request.get_json()
        # Build a one-row frame with the expected columns.
        input_df = pd.DataFrame([data], columns=features)
        prediction = model.predict(input_df)[0]
        return jsonify({'prediction': str(prediction)})
    except Exception:
        traceback.print_exc()
        # Report the failure with an error status instead of 200.
        return jsonify({'prediction': 'Prediction failed.'}), 500

if __name__ == '__main__':
    app.run(debug=True)  # debug mode is for local development only
FRONT-END :
import React, { useState } from 'react';
import styles from './ChurnForm.module.css';
import backgroundImage from './123.jpg';

const ChurnForm = () => {
  const [formData, setFormData] = useState({
    Age: '',
    HealthConcern: '',
    SelfCooking: '',
    GoodFoodQuality: '',
    LateDelivery: '',
    MoreOffersDiscount: '',
  });
  const [prediction, setPrediction] = useState(null);
  const [loading, setLoading] = useState(false);
  const [showPopup, setShowPopup] = useState(false);

  // Keep form state in sync with the input that changed.
  const handleChange = (e) => {
    const { name, value } = e.target;
    setFormData(prev => ({ ...prev, [name]: value }));
  };

  // Translate the model's 0/1 output into a readable message.
  const interpretPrediction = (pred) => {
    if (pred === "0" || pred === 0) return "No, customer will NOT leave.";
    if (pred === "1" || pred === 1) return "Yes, customer WILL leave.";
    // Anything else is an error message from the API or the fetch handler; show it as-is.
    return pred || "Prediction unclear.";
  };

  const handleSubmit = async (e) => {
    e.preventDefault();
    setLoading(true);

    // Keys must match the feature names the back end expects.
    const payload = {
      Age: parseInt(formData.Age, 10),
      "Health Concern": parseInt(formData.HealthConcern, 10),
      "Self Cooking": parseInt(formData.SelfCooking, 10),
      "Good Food quality": parseInt(formData.GoodFoodQuality, 10),
      "Late Delivery": parseInt(formData.LateDelivery, 10),
      "More Offers and Discount": parseInt(formData.MoreOffersDiscount, 10),
    };

    try {
      const response = await fetch('http://localhost:5000/predict', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload)
      });
      const data = await response.json();
      setPrediction(data.prediction);
      setShowPopup(true);
    } catch (error) {
      setPrediction('Error: Could not get prediction');
      setShowPopup(true);
      console.error(error);
    } finally {
      setLoading(false);
    }
  };

  const closePopup = () => {
    setShowPopup(false);
  };

  return (
    <div
      className={styles.fullScreenContainer}
      style={{
        backgroundImage: `url(${backgroundImage})`,
        backgroundSize: 'cover',
        backgroundPosition: 'center',
        minHeight: '100vh',
      }}
    >
      <div className={styles.formContainer}>
        <h2>CUSTOMER CHURN PREDICTION</h2>
        <form onSubmit={handleSubmit}>
          {[
            { label: 'Age', name: 'Age', min: 18, max: 100 },
            { label: 'Health Concern (1-5)', name: 'HealthConcern', min: 1, max: 5 },
            { label: 'Self Cooking (1-5)', name: 'SelfCooking', min: 1, max: 5 },
            { label: 'Good Food quality (1-5)', name: 'GoodFoodQuality', min: 1, max: 5 },
            { label: 'Late Delivery (1-5)', name: 'LateDelivery', min: 1, max: 5 },
            { label: 'Offers and Discount (1-5)', name: 'MoreOffersDiscount', min: 1, max: 5 }
          ].map(({ label, name, min, max }) => (
            <div key={name} className={styles.inputGroup}>
              <label htmlFor={name}>{label}</label>
              <input
                id={name}
                type="number"
                name={name}
                min={min}
                max={max}
                required
                value={formData[name]}
                onChange={handleChange}
              />
            </div>
          ))}
          <button type="submit" disabled={loading}>
            {loading ? 'Predicting...' : 'PREDICT CHURN !!'}
          </button>
        </form>
      </div>
      {showPopup && (
        <div className={styles.popupOverlay}>
          <div className={styles.popupContent}>
            <button onClick={closePopup} className={styles.closeButton}>×</button>
            <h3>Prediction Result</h3>
            <p className={styles.predictionText}>{interpretPrediction(prediction)}</p>
          </div>
        </div>
      )}
    </div>
  );
};

export default ChurnForm;