1/11/24, 9:37 PM Untitled10.ipynb - Colaboratory
import pandas as pd
df = pd.read_csv("supermarketsales.csv")
print("First few rows of the dataset:")
print(df.head())
First few rows of the dataset:
    Invoice ID  Branch       City  Customer type  Gender  \
0  716-39-1409       B   Mandalay         Normal    Male
1  704-48-3927       A     Yangon         Member    Male
2  628-34-3388       C  Naypyitaw         Normal    Male
3  630-74-5166       A     Yangon         Normal    Male
4  588-01-7461       C  Naypyitaw         Normal  Female

             Product line  Unit price  Quantity   Tax 5%  Total amount   Date  \
0       Health and beauty       30.35         7  10.6225        223.07  April
1  Electronic accessories       88.67        10  44.3350        931.04  April
2     Fashion accessories       27.38         6   8.2140        172.49  April
3       Sports and travel       62.13         6  18.6390        391.42  April
4      Food and beverages       33.98         9  15.2910        321.11  April

       Payment  Rating
0         Cash     8.0
1      Ewallet     7.3
2  Credit card     7.9
3         Cash     7.4
4         Cash     4.2
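The rows printed above suggest how the derived columns relate: `Tax 5%` looks like 5% of `Unit price` times `Quantity`, and `Total amount` looks like that subtotal plus the tax, rounded to two decimals. A minimal sketch that checks this on the five rows shown; the frame `sample` is retyped from the output and is an assumption, not the full CSV:

```python
import numpy as np
import pandas as pd

# Five rows retyped from the df.head() output (not the full file).
sample = pd.DataFrame({
    "Unit price": [30.35, 88.67, 27.38, 62.13, 33.98],
    "Quantity": [7, 10, 6, 6, 9],
    "Tax 5%": [10.6225, 44.3350, 8.2140, 18.6390, 15.2910],
    "Total amount": [223.07, 931.04, 172.49, 391.42, 321.11],
})

# Assumed relationships: tax = 5% of subtotal, total = subtotal + tax.
subtotal = sample["Unit price"] * sample["Quantity"]
expected_tax = 0.05 * subtotal
expected_total = subtotal + expected_tax

# Compare with a tolerance because the totals are rounded to two decimals.
tax_ok = np.isclose(sample["Tax 5%"], expected_tax, atol=1e-4)
total_ok = np.isclose(sample["Total amount"], expected_total, atol=0.01)
print(tax_ok.all(), total_ok.all())
```

Running the same comparison on the full `df` would show whether the relationship holds for all 1000 rows or only for this sample.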
print("Information about the dataset:")
df.info()  # info() prints its report itself and returns None, so it is not wrapped in print()
Information about the dataset:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 13 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Invoice ID     1000 non-null   object
 1   Branch         1000 non-null   object
 2   City           1000 non-null   object
 3   Customer type  1000 non-null   object
 4   Gender         1000 non-null   object
 5   Product line   1000 non-null   object
 6   Unit price     1000 non-null   float64
 7   Quantity       1000 non-null   int64
 8   Tax 5%         1000 non-null   float64
 9   Total amount   1000 non-null   float64
 10  Date           1000 non-null   object
 11  Payment        1000 non-null   object
 12  Rating         1000 non-null   float64
dtypes: float64(4), int64(1), object(8)
memory usage: 101.7+ KB
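`DataFrame.info()` writes its report to standard output and returns `None`. When the report is needed as a string (for logging, say), one option is to pass a text buffer through the `buf` parameter. A small sketch, using a hypothetical stand-in frame `demo` rather than the notebook's `df`:

```python
import io

import pandas as pd

# Tiny stand-in frame (hypothetical); the notebook's df would be used the same way.
demo = pd.DataFrame({
    "Quantity": [7, 10, 6],
    "Payment": ["Cash", "Ewallet", "Credit card"],
})

buffer = io.StringIO()
result = demo.info(buf=buffer)  # the report is written to the buffer, not stdout
report = buffer.getvalue()

print(result is None)  # info() has no return value
print(report)
```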
print("Summary statistics of numerical columns:")
print(df.describe())
Summary statistics of numerical columns:
        Unit price     Quantity       Tax 5%  Total amount      Rating
count  1000.000000  1000.000000  1000.000000   1000.000000  1000.00000
mean     55.672130     5.510000    15.379369    322.967430     6.97270
std      26.494628     2.923431    11.708825    245.885557     1.71858
min      10.080000     1.000000     0.508500     10.680000     4.00000
25%      32.875000     3.000000     5.924875    124.425000     5.50000
50%      55.230000     5.000000    12.088000    253.850000     7.00000
75%      77.935000     8.000000    22.445250    471.350000     8.50000
max      99.960000    10.000000    49.650000   1042.650000    10.00000
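By default `describe()` summarises only the numeric columns, which is why the eight text columns are missing from the table above. A hedged sketch of summarising the non-numeric columns instead, using a small frame `sample` retyped from the `df.head()` output (an assumption, not the full file):

```python
import numpy as np
import pandas as pd

# A few columns retyped from the first five rows of the dataset.
sample = pd.DataFrame({
    "Branch": ["B", "A", "C", "A", "C"],
    "City": ["Mandalay", "Yangon", "Naypyitaw", "Yangon", "Naypyitaw"],
    "Payment": ["Cash", "Ewallet", "Credit card", "Cash", "Cash"],
    "Rating": [8.0, 7.3, 7.9, 7.4, 4.2],
})

# Excluding numeric dtypes leaves the text columns, which get
# count / unique / top / freq instead of mean and quantiles.
text_summary = sample.describe(exclude=[np.number])
print(text_summary)
```

On the notebook's `df`, the same call would report, for each text column, how many distinct values it has and which value is most frequent.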
https://colab.research.google.com/drive/1wPr-x9-kma0doO2UknDmqGOj9sQ-15ZM#scrollTo=GEGiyS_jMHM2&printMode=true 1/2
print("Data types of each column:")
print(df.dtypes)
Data types of each column:
Invoice ID        object
Branch            object
City              object
Customer type     object
Gender            object
Product line      object
Unit price       float64
Quantity           int64
Tax 5%           float64
Total amount     float64
Date              object
Payment           object
Rating           float64
dtype: object
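The `Date` column is stored as `object` and, judging by the first rows, holds month names rather than full dates. One possible follow-up is to convert it to an ordered categorical so that sorting follows calendar order, and to store low-cardinality text columns such as `Payment` as `category`. A minimal sketch; the frame `sample` and its month values other than April are invented for illustration:

```python
import pandas as pd

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Hypothetical rows: only "April" appears in the printed output.
sample = pd.DataFrame({
    "Date": ["April", "January", "March", "April"],
    "Payment": ["Cash", "Ewallet", "Credit card", "Cash"],
})

# Ordered categorical: comparisons and sorting use calendar order, not A-Z.
month_type = pd.CategoricalDtype(categories=MONTHS, ordered=True)
sample["Date"] = sample["Date"].astype(month_type)
sample["Payment"] = sample["Payment"].astype("category")

print(sample.dtypes)
print(sample.sort_values("Date")["Date"].tolist())
```

Whether this conversion is appropriate depends on what the full `Date` column actually contains, which the printed output does not show.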