Exploratory Data Analysis66

The document analyzes customer data from a shop, covering gender, age, income, spending score, profession, work experience, and family size. Visualizations segment customers by these attributes and relate each attribute to spending score, using count plots, histograms, line plots, hexbin plots, bar plots, box plots, and violin plots. The analysis explores customer demographics, income distribution, and spending behavior by subgroup.

Uploaded by

Rishi Sahu

shop-customer-data-analysis

March 16, 2023

[1]: #importing necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

[2]: #loading the dataset

df = pd.read_csv("/kaggle/input/customers-dataset/[Link]")

[3]: #extracting first-five rows

df.head()

[3]: CustomerID Gender Age Annual Income ($) Spending Score (1-100) \
0 1 Male 19 15000 39
1 2 Male 21 35000 81
2 3 Female 20 86000 6
3 4 Female 23 59000 77
4 5 Female 31 38000 40

Profession Work Experience Family Size

0 Healthcare 1 4
1 Engineer 3 3
2 Engineer 1 1
3 Lawyer 0 2
4 Entertainment 2 6

[4]: #extracting last-five rows

df.tail()

[4]: CustomerID Gender Age Annual Income ($) Spending Score (1-100) \
1995 1996 Female 71 184387 40
1996 1997 Female 91 73158 32
1997 1998 Male 87 90961 14
1998 1999 Male 77 182109 4
1999 2000 Male 90 110610 52

Profession Work Experience Family Size

1995 Artist 8 7
1996 Doctor 7 7
1997 Healthcare 9 2
1998 Executive 7 2
1999 Entertainment 5 2

[5]: #determining the shape

df.shape

[5]: (2000, 8)

[6]: #determining the size

df.size

[6]: 16000

[7]: #checking the null values

df.isnull().sum()

[7]: CustomerID 0
Gender 0
Age 0
Annual Income ($) 0
Spending Score (1-100) 0
Profession 35
Work Experience 0
Family Size 0
dtype: int64

[8]: #determining mode of 'Profession' column

df["Profession"].mode()

[8]: 0 Artist
dtype: object

[9]: #replacing null values with mode

df["Profession"] = df["Profession"].fillna("Artist")
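
The fill value in the cell above is typed in by hand from the mode found in cell [8]. A minimal sketch of computing the mode and filling in one step, so the value cannot drift from the data. The frame `demo` and its values are made up for illustration and stand in for the real dataset.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the 'Profession' column (not the real data).
demo = pd.DataFrame({"Profession": ["Artist", "Doctor", np.nan, "Artist", np.nan]})

# mode() returns a Series because ties are possible; take the first entry.
most_common = demo["Profession"].mode()[0]

# Assign the filled column back to the frame.
demo["Profession"] = demo["Profession"].fillna(most_common)

# Count the nulls that remain after filling.
print(demo["Profession"].isnull().sum())
```

With real data it may be worth checking how many values share the top count before picking the first mode.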

[10]: # checking the duplicates

df.duplicated().value_counts()

[10]: False 2000

dtype: int64

[11]: #checking the information

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2000 entries, 0 to 1999
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CustomerID 2000 non-null int64
1 Gender 2000 non-null object
2 Age 2000 non-null int64
3 Annual Income ($) 2000 non-null int64
4 Spending Score (1-100) 2000 non-null int64
5 Profession 2000 non-null object
6 Work Experience 2000 non-null int64
7 Family Size 2000 non-null int64
dtypes: int64(6), object(2)
memory usage: 125.1+ KB

[12]: #extracting statistical summary

df.describe()

[12]: CustomerID Age Annual Income ($) Spending Score (1-100) \

count 2000.000000 2000.000000 2000.000000 2000.000000
mean 1000.500000 48.960000 110731.821500 50.962500
std 577.494589 28.429747 45739.536688 27.934661
min 1.000000 0.000000 0.000000 0.000000
25% 500.750000 25.000000 74572.000000 28.000000
50% 1000.500000 48.000000 110045.000000 50.000000
75% 1500.250000 73.000000 149092.750000 75.000000
max 2000.000000 99.000000 189974.000000 100.000000

Work Experience Family Size

count 2000.000000 2000.000000
mean 4.102500 3.768500
std 3.922204 1.970749
min 0.000000 1.000000
25% 1.000000 2.000000
50% 3.000000 4.000000
75% 7.000000 5.000000
max 17.000000 9.000000
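
The summary above reports a minimum of 0 for both Age and Annual Income ($), which may point to placeholder or erroneous entries. A hedged sketch of counting such zeros per column and listing the affected rows before deciding how to treat them. The frame `demo` holds made-up rows, not the real dataset.

```python
import pandas as pd

# Made-up rows for illustration; column names follow the dataset.
demo = pd.DataFrame({
    "Age": [19, 0, 45, 0],
    "Annual Income ($)": [15000, 35000, 0, 59000],
    "Spending Score (1-100)": [39, 81, 6, 77],
})

cols = ["Age", "Annual Income ($)"]

# Number of zero entries in each column of interest.
zero_counts = (demo[cols] == 0).sum()
print(zero_counts)

# Rows where at least one of those columns is zero.
suspect_rows = demo[(demo[cols] == 0).any(axis=1)]
print(suspect_rows)
```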

[13]: #creating the pairplot

sns.pairplot(df.drop("CustomerID", axis=1))

[13]: <seaborn.axisgrid.PairGrid at 0x7f21431e3c90>

[14]: # segment customers by gender
sns.countplot(x='Gender', data=df)
plt.title('Customer Gender Distribution')
plt.show()

[28]: # segment customers by age
sns.histplot(x='Age', data=df, color='purple', bins=20)
plt.title('Customer Age Distribution')
plt.show()

[29]: # segment by income
sns.kdeplot(x='Annual Income ($)', data=df, color="green", fill=True)
plt.title('Income Distribution')
plt.show()

[17]: # segment customers by profession
sns.countplot(x='Profession', data=df)
plt.xticks(rotation=45)
plt.title('Customer Profession Distribution')
plt.show()

[30]: # segment customers by work experience
sns.kdeplot(x='Work Experience', data=df, color='red', fill=True)
plt.title('Work Experience Distribution')
plt.show()

[19]: # segment customers by family size
sns.countplot(x='Family Size', data=df)
plt.title('Customer Family Size Distribution')
plt.show()

[20]: # spending score by gender
sns.boxplot(x='Gender', y='Spending Score (1-100)', data=df)
plt.title('Spending Score by Gender')
plt.show()

[31]: # spending behavior by age
sns.lineplot(x='Age', y='Spending Score (1-100)', color="orange", data=df)
plt.title('Spending Score by Age')
plt.show()

[22]: # analyze spending behavior by age and gender
sns.lineplot(x='Age', y='Spending Score (1-100)', hue='Gender', data=df)
plt.title('Spending Score by Age and Gender')
plt.show()

[23]: # spending behavior by income
plt.hexbin(x='Annual Income ($)', y='Spending Score (1-100)', data=df, gridsize=20, cmap='Blues')
plt.xlabel('Annual Income ($)')
plt.xticks(rotation=45)
plt.ylabel('Spending Score (1-100)')
plt.title('Spending Score by Income')
plt.colorbar()
plt.show()

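
As a numeric complement to the income plot, the linear association between income and spending score can be summarized with a correlation coefficient. A minimal sketch using only the five rows printed by the first-five-rows cell, so the result says nothing about the full dataset. `Series.corr` computes the Pearson coefficient by default.

```python
import pandas as pd

# The five rows shown in the first-five-rows output; a toy sample only.
sample = pd.DataFrame({
    "Annual Income ($)": [15000, 35000, 86000, 59000, 38000],
    "Spending Score (1-100)": [39, 81, 6, 77, 40],
})

# Pearson correlation between the two columns (default method).
r = sample["Annual Income ($)"].corr(sample["Spending Score (1-100)"])
print(round(r, 3))
```

On the full frame the same call would be applied to the corresponding columns of `df`.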
[24]: # spending behavior by profession
sns.barplot(x='Profession', y='Spending Score (1-100)', data=df)
plt.xticks(rotation=45)
plt.title('Spending Score by Profession')
plt.show()

[32]: # spending behavior by work experience
sns.boxplot(x='Work Experience', y='Spending Score (1-100)', data=df)
plt.title('Spending Score by Work Experience')
plt.show()

[26]: # spending behavior by family size
sns.violinplot(x='Family Size', y='Spending Score (1-100)', data=df)
plt.title('Spending Score by Family Size')
plt.show()

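
The subgroup plots can be backed by a small table of group statistics. A hedged sketch, using only the ten rows printed by the first-five and last-five cells, of summarizing spending score per family size with `groupby`. The names `sample` and `summary` are illustrative.

```python
import pandas as pd

# The ten rows printed by the head and tail cells; a toy sample only.
sample = pd.DataFrame({
    "Family Size": [4, 3, 1, 2, 6, 7, 7, 2, 2, 2],
    "Spending Score (1-100)": [39, 81, 6, 77, 40, 40, 32, 14, 4, 52],
})

# Count, median, and mean of spending score within each family size.
summary = (
    sample.groupby("Family Size")["Spending Score (1-100)"]
    .agg(["count", "median", "mean"])
)
print(summary)
```

The same pattern applies to the other grouping columns, such as Gender or Profession, by changing the `groupby` key.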