Zomato Sales Analysis

The document provides a detailed analysis of a dataset from Zomato, focusing on various aspects such as restaurant ratings, types, and customer voting behavior. It includes data cleaning steps, visualizations using libraries like Seaborn, and insights derived from the data, such as the popularity of dining restaurants and average spending by couples. The analysis concludes with recommendations for Zomato based on customer preferences and ordering patterns.

Uploaded by

Aaditya Raj Pandey


import pandas as pd

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

dataframe = pd.read_csv("Zomato data .csv")

print(dataframe)

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1/5 775
1 Spice Elephant Yes No 4.1/5 787
2 San Churro Cafe Yes No 3.8/5 918
3 Addhuri Udupi Bhojana No No 3.7/5 88
4 Grand Village No No 3.8/5 166
.. ... ... ... ... ...
143 Melting Melodies No No 3.3/5 0
144 New Indraprasta No No 3.3/5 0
145 Anna Kuteera Yes No 4.0/5 771
146 Darbar No No 3.0/5 98
147 Vijayalakshmi Yes No 3.9/5 47

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet
.. ... ...
143 100 Dining
144 150 Dining
145 450 Dining
146 800 Dining
147 200 Dining

[148 rows x 7 columns]

Next, we display the data in our Jupyter notebook:

dataframe

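name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1/5 775
1 Spice Elephant Yes No 4.1/5 787
2 San Churro Cafe Yes No 3.8/5 918
3 Addhuri Udupi Bhojana No No 3.7/5 88
4 Grand Village No No 3.8/5 166
.. ... ... ... ... ...
143 Melting Melodies No No 3.3/5 0
144 New Indraprasta No No 3.3/5 0
145 Anna Kuteera Yes No 4.0/5 771
146 Darbar No No 3.0/5 98
147 Vijayalakshmi Yes No 3.9/5 47
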
approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet
.. ... ...
143 100 Dining
144 150 Dining
145 450 Dining
146 800 Dining
147 200 Dining

[148 rows x 7 columns]

Now we will be working on this data

First of all, the dataset has one problem: the rating column carries a "/5" suffix (for example, 4.1/5) that we want to remove. Everything else is fine.

Convert the data type of the rate column

def handleRate(value):
    # Keep only the part before "/" (e.g. "4.1/5" -> "4.1") and convert it to float
    value = str(value).split('/')
    value = value[0]
    return float(value)

dataframe['rate']=dataframe['rate'].apply(handleRate)
print(dataframe.head())

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet
First, we create a user-defined function named "handleRate" that takes a single value. We wrap the value in str() because the ratings in this column are stored as strings. The split function then does the cutting: a rating is written like 4.1/5, and splitting on "/" separates the 4.1 from the 5.

We need only the 4.1, so value = value[0] picks the element at position 0, and the function returns that value as a float.

Finally, the rate column of the dataframe is the one we want to change, so we apply the newly made function, handleRate, to it with dataframe['rate'].apply(handleRate).
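
As an aside, the same cleanup can be sketched with pandas string methods instead of a row-by-row function. This is only an illustration, not the notebook's method: the first three ratings are copied from the output above, while "NEW" and "-" are hypothetical bad entries added to show why errors="coerce" can be useful (float() in handleRate would raise an error on such values).

```python
import pandas as pd

# First three values come from the printed output above; "NEW" and "-" are
# hypothetical placeholders added only to show how bad entries are handled.
rates = pd.Series(["4.1/5", "3.8/5", "3.7/5", "NEW", "-"])

# Keep the text before "/" for every row at once.
numerator = rates.str.split("/").str[0]

# Convert to numbers; errors="coerce" turns non-numeric text into NaN
# instead of raising an exception.
parsed = pd.to_numeric(numerator, errors="coerce")

print(parsed.tolist())
```

With the 148 rows shown in this document every rating is well formed, so handleRate works as written; the coercion only matters if the file contains non-numeric ratings.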
dataframe

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166
.. ... ... ... ... ...
143 Melting Melodies No No 3.3 0
144 New Indraprasta No No 3.3 0
145 Anna Kuteera Yes No 4.0 771
146 Darbar No No 3.0 98
147 Vijayalakshmi Yes No 3.9 47

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet
.. ... ...
143 100 Dining
144 150 Dining
145 450 Dining
146 800 Dining
147 200 Dining

[148 rows x 7 columns]

Now we check once more whether any value is missing or null:
dataframe.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 148 entries, 0 to 147
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 name 148 non-null object
1 online_order 148 non-null object
2 book_table 148 non-null object
3 rate 148 non-null float64
4 votes 148 non-null int64
5 approx_cost(for two people) 148 non-null int64
6 listed_in(type) 148 non-null object
dtypes: float64(1), int64(2), object(4)
memory usage: 8.2+ KB

All values are present (148 non-null entries in every column). The info() function is used here only to inspect the dataframe.
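
A more direct way to count missing values is isnull().sum(). The sketch below uses a tiny made-up frame (three restaurant names and vote counts from the output above, with one NaN inserted on purpose) so that the check has something to find; it is not the notebook's data.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; the NaN in "rate" is inserted on purpose.
df = pd.DataFrame({
    "name": ["Jalsa", "Spice Elephant", "San Churro Cafe"],
    "rate": [4.1, np.nan, 3.8],
    "votes": [775, 787, 918],
})

# Number of missing values in each column.
missing_per_column = df.isnull().sum()

# True if at least one value anywhere in the frame is missing.
has_missing = bool(df.isnull().values.any())

print(missing_per_column)
print("Any missing values?", has_missing)
```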

Now the first question

Q1. What type of restaurant do the majority of customers order from?

In other words, we need to find the type of restaurant from which most customers order food.

Type of Restaurant
Note: we have to show this using a bar graph.

dataframe.head()

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet

Calling head() shows the first 5 rows of the data.

Now we want to make a bar graph of the restaurant types; for this we use the seaborn library.
sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("type of restaurant")

Text(0.5, 0, 'type of restaurant')

Now let us go through these two lines of code.

What is a countplot?

A count plot is a type of visualization that displays the number of observations in each category of a categorical variable.

sns.countplot(x=dataframe['listed_in(type)'])
This shows the count of observations in each categorical bin using bars. Whenever we need a plot of exact per-category counts, we use countplot.

x=dataframe['listed_in(type)']
On the x-axis we want the type of restaurant, i.e. the listed_in(type) column.

plt.xlabel("type of restaurant")
This sets the label of the x-axis.

Conclusion ---> The majority of the restaurants fall in the Dining category.
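
The numbers behind a countplot can also be read off directly with value_counts(). The following is a minimal sketch on a made-up list of restaurant types ("Cafe" is a hypothetical category used only for illustration), not on the real file:

```python
import pandas as pd

# Made-up sample of restaurant types, for illustration only.
types = pd.Series(
    ["Buffet", "Dining", "Dining", "Cafe", "Dining", "Buffet"],
    name="listed_in(type)",
)

# Count how many rows fall into each category (this is what countplot draws).
counts = types.value_counts()

# The category with the highest count.
most_common = counts.idxmax()

print(counts)
print("Most common type:", most_common)
```

On the real dataframe the same call would be dataframe['listed_in(type)'].value_counts().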
### Q2. How many votes has each type of restaurant received from customers?

That is, we want the total number of votes for each type of restaurant.

dataframe.head()

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet

grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c="green", marker="o")
plt.xlabel("Type of restaurant", c="red", size=20)
plt.ylabel("votes", c="red", size=20)

Text(0, 0.5, 'votes')

Explanation of the code

1st) A variable named grouped_data is created.

2nd) Only two columns of the dataframe matter here, listed_in(type) and votes. We group the rows by listed_in(type) and sum the votes within each group:

dataframe.groupby('listed_in(type)')['votes'].sum()

3rd) The grouped data is then passed into result. A pandas DataFrame (pd.DataFrame) is a structure that contains two-dimensional data and its corresponding labels.

4th) marker="o" draws a circular marker at each data point, and c="green" sets the line color.
5th) The x-axis is labeled "Type of restaurant".
6th) Similarly, the y-axis is labeled "votes".

Conclusion --> Dining restaurants have received the maximum votes.

This is how we can find insights from data, so that companies can plan their strategies.
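
To see the grouping step on concrete numbers, here is a small sketch that uses only the ten rows visible in the printed output above (rows 0-4 and 143-147). Because this is a ten-row sample, its result differs from the conclusion drawn from the full 148-row dataset.

```python
import pandas as pd

# Ten rows copied from the printed output above (rows 0-4 and 143-147).
sample = pd.DataFrame({
    "listed_in(type)": ["Buffet"] * 5 + ["Dining"] * 5,
    "votes": [775, 787, 918, 88, 166, 0, 0, 771, 98, 47],
})

# Total votes per restaurant type, largest first.
votes_by_type = (
    sample.groupby("listed_in(type)")["votes"]
    .sum()
    .sort_values(ascending=False)
)

# Restaurant type with the highest total in this sample.
top_type = votes_by_type.idxmax()

print(votes_by_type)
print("Most-voted type in this sample:", top_type)
```

Sorting the totals and calling idxmax() gives the answer as a value, without having to read it off a chart.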
Q3. What are the ratings that the majority of restaurants have received?
dataframe.head()

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet

plt.hist(dataframe['rate'],bins=5)
plt.title("ratings distribution")
plt.show()
Code Explanation
plt.hist draws the histogram; we pass dataframe['rate'] because this is the column we are working on.

bins=5 splits the rating range into 5 equal-width intervals, each drawn as one bar.

Conclusion
The majority of restaurants received ratings between 3.5 and 4.
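
The bin counts that the histogram draws can also be computed with numpy. The sketch below uses only the ten ratings visible in the printed output above, so the busiest bin here need not match the full dataset; it is meant to show what bins=5 does, not to reproduce the chart.

```python
import numpy as np

# Ratings of the ten rows shown in the printed output above.
rates = np.array([4.1, 4.1, 3.8, 3.7, 3.8, 3.3, 3.3, 4.0, 3.0, 3.9])

# Split the range [min, max] into 5 equal-width bins and count the values in each.
counts, edges = np.histogram(rates, bins=5)

# Index of the bin that holds the most ratings.
busiest = int(np.argmax(counts))

for i, count in enumerate(counts):
    print(f"{edges[i]:.2f} to {edges[i + 1]:.2f}: {count}")
print(f"Busiest bin: {edges[busiest]:.2f} to {edges[busiest + 1]:.2f}")
```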

Q4. Zomato has observed that most couples order most of their food online. What is their average spending on each order?

Average Order Spending on Food by Couples

dataframe.head()

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet

couple_data=dataframe['approx_cost(for two people)']

sns.countplot(x=couple_data)

<Axes: xlabel='approx_cost(for two people)', ylabel='count'>

About the Code
1) A variable named couple_data is created.
2) It holds the approx_cost(for two people) column selected from the dataframe.
3) We draw a countplot.
4) couple_data is passed as the x-axis.

Conclusion
The majority of couples prefer restaurants with an approximate cost of Rs. 300.

The company can therefore show these customers items priced around Rs. 300 in their accounts, rather than more expensive ones, so that sales improve (take the example of a budget iPhone ad).
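
Note that the tallest bar of a countplot is the most common cost (the mode), which is not the same thing as the average (the mean). The sketch below computes both on the ten rows visible in the printed output above, and also the mean for online-ordering restaurants only. Filtering on online_order is an assumption about how the question could be read, not something the notebook does, and the numbers describe this ten-row sample only.

```python
import pandas as pd

# Ten rows copied from the printed output above (rows 0-4 and 143-147).
sample = pd.DataFrame({
    "online_order": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes", "No", "Yes"],
    "approx_cost(for two people)": [800, 800, 800, 300, 600, 100, 150, 450, 800, 200],
})
cost = sample["approx_cost(for two people)"]

# Most frequent cost: what the tallest countplot bar shows.
most_common_cost = cost.mode().iloc[0]

# Arithmetic average over all rows.
mean_cost = cost.mean()

# Average restricted to restaurants that accept online orders (an assumed reading).
online_only = sample[sample["online_order"] == "Yes"]
online_mean_cost = online_only["approx_cost(for two people)"].mean()

print("Most common cost:", most_common_cost)
print("Mean cost:", mean_cost)
print("Mean cost, online orders only:", online_mean_cost)
```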

Q5. Which mode (online or offline) has received the maximum rating?
dataframe.head()

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166
approx_cost(for two people) listed_in(type)
0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet

plt.figure(figsize = (6,6))
sns.boxplot(x = 'online_order', y= 'rate', data = dataframe)

<Axes: xlabel='online_order', ylabel='rate'>

Code
1) A figure is created with a size of 6 x 6.
2) A boxplot is drawn with online_order on the x-axis and rate on the y-axis, both taken from dataframe.

Conclusion
Restaurants that accept online orders clearly receive higher ratings; offline orders receive lower ratings in comparison with the online mode.
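
The comparison that the boxplot makes visually can be backed with a few summary numbers per group. This is a minimal sketch on the ten rows visible in the printed output above; the values describe that sample only.

```python
import pandas as pd

# Ten rows copied from the printed output above (rows 0-4 and 143-147).
sample = pd.DataFrame({
    "online_order": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes", "No", "Yes"],
    "rate": [4.1, 4.1, 3.8, 3.7, 3.8, 3.3, 3.3, 4.0, 3.0, 3.9],
})

# Median, mean and row count of the rating for each ordering mode.
summary = sample.groupby("online_order")["rate"].agg(["median", "mean", "count"])

print(summary)
```

The median is the line drawn inside each box, so comparing the two medians corresponds to comparing the two boxes.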

Q6. Which type of restaurant received more offline orders, so that Zomato can provide those customers with some good offers?
dataframe.head()

name online_order book_table rate votes \

0 Jalsa Yes Yes 4.1 775
1 Spice Elephant Yes No 4.1 787
2 San Churro Cafe Yes No 3.8 918
3 Addhuri Udupi Bhojana No No 3.7 88
4 Grand Village No No 3.8 166

approx_cost(for two people) listed_in(type)

0 800 Buffet
1 800 Buffet
2 800 Buffet
3 300 Buffet
4 600 Buffet

pivot_table = dataframe.pivot_table(index='listed_in(type)',
columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap="YlGnBu",fmt='d')
plt.title("Heatmap")
plt.xlabel("Online Order")
plt.ylabel("Listed In (Type)")
plt.show()
Code Explanation
1) We create a pivot table because we need a table of counts here, and store it in a variable named pivot_table.
2) The columns required for this question are listed_in(type) and online_order.
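
The same table of counts can also be built with pd.crosstab, which some readers may find easier to follow than pivot_table with aggfunc='size'. This is a sketch on the ten rows visible in the printed output above, not a replacement for the notebook's code, and the counts describe this sample only.

```python
import pandas as pd

# Ten rows copied from the printed output above (rows 0-4 and 143-147).
sample = pd.DataFrame({
    "listed_in(type)": ["Buffet"] * 5 + ["Dining"] * 5,
    "online_order": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes", "No", "Yes"],
})

# Rows: restaurant type. Columns: online_order. Cells: number of restaurants.
table = pd.crosstab(sample["listed_in(type)"], sample["online_order"])

# Restaurant type with the most offline (online_order == "No") entries.
most_offline = table["No"].idxmax()

print(table)
print("Type with most offline entries in this sample:", most_offline)
```

The resulting table can be passed to sns.heatmap in the same way as pivot_table.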
