
Data Analysis Report - Colaboratory

Uploaded by

Chayan Patidar

12/23/23, 10:01 PM Data analysis report - Colaboratory

Q1: Sales Analysis


import pandas as pd
import numpy as np

item_category = pd.read_csv("/content/[Link]")
sales = pd.read_csv("/content/[Link]")
wholesale = pd.read_csv("/content/[Link]")
loss_rate = pd.read_csv("/content/[Link]")

item_category.head()

Item Code Item Name Category Code Category Name

0 102900005115168 Niushou Shengcai 1011010101 Flower/Leaf Vegetables

1 102900005115199 Sichuan Red Cedar 1011010101 Flower/Leaf Vegetables

2 102900005115625 Local Xiaomao Cabbage 1011010101 Flower/Leaf Vegetables

3 102900005115748 White Caitai 1011010101 Flower/Leaf Vegetables

4 102900005115762 Amaranth 1011010101 Flower/Leaf Vegetables

sales.head()

         Date        Time     Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Sale or Return  Discount (Yes/No)
0  2020-07-01  [Link].924  1.029000e+14                 0.396                          7.6            sale                 No
1  2020-07-01  [Link].295  1.029000e+14                 0.849                          3.2            sale                 No

wholesale.head()

Date Item Code Wholesale Price (RMB/kg)

0 2020-07-01 102900005115762 3.88

1 2020-07-01 102900005115779 6.72

2 2020-07-01 102900005115786 3.19

3 2020-07-01 102900005115793 9.24

4 2020-07-01 102900005115823 7.03

loss_rate.head()

Item Code Item Name Loss Rate (%)

0 102900005115168 Niushou Shengcai 4.39

1 102900005115199 Sichuan Red Cedar 10.46

2 102900005115250 Xixia Black Mushroom (1) 10.80

3 102900005115625 Local Xiaomao Cabbage 0.18

4 102900005115748 White Caitai 8.78

sales_wholesale_combine_data = pd.merge(sales, wholesale, how="left", on=["Item Code", "Date"])


sales_wholesale_combine_data.head()

         Date        Time     Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Sale or Return  Discount (Yes/No)  Wholesale Price (RMB/kg)
0  2020-07-01  [Link].924  1.029000e+14                 0.396                          7.6            sale                 No                      4.32
1  2020-07-01  [Link].295  1.029000e+14                 0.849                          3.2            sale                 No                      2.10

sales_wholesale_combine_data.isnull().sum()

Date                           0
Time                           0
Item Code                      1
Quantity Sold (kilo)           1
Unit Selling Price (RMB/kg)    1
Sale or Return                 1
Discount (Yes/No)              1
Wholesale Price (RMB/kg)       1
dtype: int64
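
A left merge keeps every sales row even when no wholesale price matches, so it can be worth checking how many rows went unmatched. The following is a minimal sketch on toy data; the frames `sales_demo` and `wholesale_demo` and their values are illustrative assumptions, not the notebook's files. It uses the `indicator` and `validate` options of `pd.merge`.

```python
import pandas as pd

# Toy stand-ins for the sales and wholesale tables (illustrative values only).
sales_demo = pd.DataFrame({
    "Date": ["2020-07-01", "2020-07-01", "2020-07-02"],
    "Item Code": [101, 102, 101],
    "Quantity Sold (kilo)": [0.396, 0.849, 0.409],
})
wholesale_demo = pd.DataFrame({
    "Date": ["2020-07-01", "2020-07-01"],
    "Item Code": [101, 102],
    "Wholesale Price (RMB/kg)": [4.32, 2.10],
})

merged_demo = pd.merge(
    sales_demo,
    wholesale_demo,
    how="left",
    on=["Item Code", "Date"],
    indicator=True,          # adds a "_merge" column describing each row's origin
    validate="many_to_one",  # raises if the wholesale keys are not unique
)

# Rows that exist only in the sales table have no wholesale price.
unmatched = merged_demo[merged_demo["_merge"] == "left_only"]
print(len(unmatched))
```

If `unmatched` is empty on the real data, every sale found a wholesale price for its item and date.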

sales_wholesale_category = pd.merge(sales_wholesale_combine_data, item_category, how="left", on="Item Code")


sales_wholesale_category.head()

         Date        Time     Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Sale or Return  Discount (Yes/No)  Wholesale Price (RMB/kg)  ...
0  2020-07-01  [Link].924  1.029000e+14                 0.396                          7.6            sale                 No                      4.32  ...
1  2020-07-01  [Link].295  1.029000e+14                 0.849                          3.2            sale                 No                      2.10  ...
2  2020-07-01  [Link].905  1.029000e+14                 0.409                          7.6            sale                 No                      4.32  ...
3  2020-07-01  [Link].450  1.029000e+14                 0.421                         10.0            sale                 No                      7.03  ...

final_data = pd.merge(sales_wholesale_category, loss_rate, how="left", on=["Item Code", "Item Name"])


final_data.head()

         Date        Time     Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Sale or Return  Discount (Yes/No)  Wholesale Price (RMB/kg)  ...
0  2020-07-01  [Link].924  1.029000e+14                 0.396                          7.6            sale                 No                      4.32  ...
1  2020-07-01  [Link].295  1.029000e+14                 0.849                          3.2            sale                 No                      2.10  ...
2  2020-07-01  [Link].905  1.029000e+14                 0.409                          7.6            sale                 No                      4.32  ...
3  2020-07-01  [Link].450  1.029000e+14                 0.421                         10.0            sale                 No                      7.03  ...

final_data["total_sales"]=final_data["Quantity Sold (kilo)"]*final_data["Unit Selling Price (RMB/kg)"]
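
Because the data has a "Sale or Return" column, and the summary statistics further down show negative quantities, a plain sum of `total_sales` is presumably a net figure with returns already subtracted. The sketch below, on made-up rows, assumes returns are recorded with negative quantities and separates gross sales from returns.

```python
import pandas as pd

# Toy rows (illustrative values); returns are assumed to carry negative quantities.
demo = pd.DataFrame({
    "Quantity Sold (kilo)": [0.5, 1.0, -0.5],
    "Unit Selling Price (RMB/kg)": [8.0, 4.0, 8.0],
    "Sale or Return": ["sale", "sale", "return"],
})
demo["total_sales"] = demo["Quantity Sold (kilo)"] * demo["Unit Selling Price (RMB/kg)"]

# Sum revenue separately for sales and for returns.
by_type = demo.groupby("Sale or Return")["total_sales"].sum()
gross = by_type["sale"]           # revenue before returns
net = demo["total_sales"].sum()   # revenue after returns
print(by_type)
print(gross, net)
```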

final_data.head()

         Date        Time     Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Sale or Return  Discount (Yes/No)  Wholesale Price (RMB/kg)  ...
0  2020-07-01  [Link].924  1.029000e+14                 0.396                          7.6            sale                 No                      4.32  ...
1  2020-07-01  [Link].295  1.029000e+14                 0.849                          3.2            sale                 No                      2.10  ...
2  2020-07-01  [Link].905  1.029000e+14                 0.409                          7.6            sale                 No                      4.32  ...
3  2020-07-01  [Link].450  1.029000e+14                 0.421                         10.0            sale                 No                      7.03  ...

final_data.isnull().sum()

Date                           0
Time                           0
Item Code                      1
Quantity Sold (kilo)           1
Unit Selling Price (RMB/kg)    1
Sale or Return                 1
Discount (Yes/No)              1
Wholesale Price (RMB/kg)       1
Item Name                      1
Category Code                  1
Category Name                  1
Loss Rate (%)                  1
total_sales                    1
dtype: int64
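
The counts above show exactly one missing value in every column except Date and Time, which suggests a single incomplete row rather than scattered gaps. A minimal sketch of how such a row could be inspected and dropped before aggregation follows; the frame `demo` and its values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy frame with one complete row and one row that has only Date and Time.
demo = pd.DataFrame({
    "Date": ["2020-07-01", "2020-08-02"],
    "Time": ["09:15", "11"],
    "Item Code": [102900005115168.0, np.nan],
    "Quantity Sold (kilo)": [0.396, np.nan],
})

key_cols = ["Item Code", "Quantity Sold (kilo)"]

# Show rows where every key column is missing.
print(demo[demo[key_cols].isna().all(axis=1)])

# Keep only rows that have at least one key value.
clean = demo.dropna(subset=key_cols, how="all")
print(len(clean))
```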

final_data.duplicated().sum()

final_data.dtypes

Date object
Time object
Item Code float64
Quantity Sold (kilo) float64
Unit Selling Price (RMB/kg) float64
Sale or Return object
Discount (Yes/No) object
Wholesale Price (RMB/kg) float64
Item Name object
Category Code float64
Category Name object
Loss Rate (%) float64
total_sales float64
dtype: object
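
Item Code and Category Code are identifiers, yet they appear here as float64, which is why they print in scientific notation (1.029000e+14). This is most likely a side effect of the missing value in each column. One possible remedy, sketched below on a toy Series, is pandas' nullable `Int64` dtype, which keeps missing values while storing whole numbers.

```python
import numpy as np
import pandas as pd

# Item codes as they arrive after the merge: floats with one missing value.
codes = pd.Series([102900005115168.0, 102900005115199.0, np.nan], name="Item Code")

# Nullable integer dtype: whole numbers are preserved and NaN becomes <NA>.
codes_int = codes.astype("Int64")
print(codes_int)
```

Reading the column as a string when loading the CSV would be another option if the codes are never used arithmetically.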

final_data.describe().astype(int)

             Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Wholesale Price (RMB/kg)  Category Code  Loss Rate (%)  total_sales
count            35319                 35319                        35319                     35319          35319          35319        35319
mean   102911148089094                     0                           10                         6     1011010348             10            3
std       212277205963                     0                            4                         2            258              4            2
min    102900005115762                    -2                            0                         1     1011010101              2          -13
25%    102900005115984                     0                            7                         4     1011010101              7            2
50%    102900005116714                     0                            9                         5     1011010201             10            3
75%    102900005125808                     0                           12                         6     1011010504             13            4
max    106956146480203                     4                           39                        29     1011010801             29           32
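
Casting the summary to `int` truncates every statistic toward zero, so the quantity column shows 0 for its mean and quartiles even though individual sales are fractions of a kilo. A small sketch of the difference, using four quantities from the preview rows above; rounding is offered here as one alternative, not as the notebook's method.

```python
import pandas as pd

col = "Quantity Sold (kilo)"
demo = pd.DataFrame({col: [0.396, 0.849, 0.409, 0.421]})

truncated = demo.describe().astype(int)  # fractional statistics collapse to 0
rounded = demo.describe().round(2)       # keeps two decimal places

print(truncated.loc["mean", col])
print(rounded.loc["mean", col])
```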

for i in final_data.columns:
    print(i)
    print(final_data[i].unique())
    print("************************************************************************************")


final_data.head()

         Date        Time     Item Code  Quantity Sold (kilo)  Unit Selling Price (RMB/kg)  Sale or Return  Discount (Yes/No)  Wholesale Price (RMB/kg)  \
0  2020-07-01  [Link].924  1.029000e+14                 0.396                          7.6            sale                 No                      4.32
1  2020-07-01  [Link].295  1.029000e+14                 0.849                          3.2            sale                 No                      2.10
2  2020-07-01  [Link].905  1.029000e+14                 0.409                          7.6            sale                 No                      4.32
3  2020-07-01  [Link].450  1.029000e+14                 0.421                         10.0            sale                 No                      7.03

              Item Name  Category Code           Category Name  ...
0  Paopaojiao (Jingpin)   1.011011e+09                Capsicum  ...
1       Chinese Cabbage   1.011010e+09  Flower/Leaf Vegetables  ...
2  Paopaojiao (Jingpin)   1.011011e+09                Capsicum  ...
3          Shanghaiqing   1.011010e+09  Flower/Leaf Vegetables  ...

final_data["Item Name"].nunique()
final_data["Category Name"].nunique()

category_name_wise_sales = final_data.groupby(["Category Name"])["total_sales"].sum().reset_index()


category_name_wise_sales["total_sales"]=category_name_wise_sales["total_sales"].astype(int)

category_name_wise_sales

                 Category Name  total_sales
0  Aquatic Tuberous Vegetables         5000
1                      Cabbage        17388
2                     Capsicum        24346
3              Edible Mushroom        20373
4       Flower/Leaf Vegetables        51728
5                      Solanum        10048

import matplotlib.pyplot as plt
import seaborn as sns

sns.barplot(x="Category Name", y="total_sales", data=category_name_wise_sales)
plt.xlabel("Category Name")
plt.ylabel("Total Sales")
plt.title("Category Wise Sales")
plt.show()


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

item_category = pd.read_csv("/content/[Link]")
sales = pd.read_csv("/content/[Link]")
wholesale = pd.read_csv("/content/[Link]")
loss_rate = pd.read_csv("/content/[Link]")
item_category.head()
sales.head()
wholesale.head()
loss_rate.head()
sales_wholesale_combine_data = pd.merge(sales, wholesale, how="left", on=["Item Code", "Date"])
sales_wholesale_combine_data.head()
sales_wholesale_combine_data.isnull().sum()
sales_wholesale_category = pd.merge(sales_wholesale_combine_data, item_category, how="left", on="Item Code")
sales_wholesale_category.head()
final_data = pd.merge(sales_wholesale_category, loss_rate, how="left", on=["Item Code", "Item Name"])
final_data.head()
final_data["total_sales"]=final_data["Quantity Sold (kilo)"]*final_data["Unit Selling Price (RMB/kg)"]
final_data.head()
final_data.isnull().sum()
final_data.duplicated().sum()
final_data.dtypes
final_data.describe().astype(int)
for i in final_data.columns:
    print(i)
    print(final_data[i].unique())
    print("************************************************************************************")

final_data.head()
final_data["Item Name"].nunique()
final_data["Category Name"].nunique()
category_name_wise_sales = final_data.groupby(["Category Name"])["total_sales"].sum().reset_index()
category_name_wise_sales["total_sales"]=category_name_wise_sales["total_sales"].astype(int)
category_name_wise_sales

sns.barplot(x="Category Name", y="total_sales", data=category_name_wise_sales)
plt.xlabel("Category Name")
plt.ylabel("Total Sales")
plt.title("Category Wise Sales")
plt.show()

# Pie chart
category_name_wise_sales.plot(kind='pie', y='total_sales', autopct='%1.1f%%', shadow=True, startangle=140)
plt.title('Category Wise Sales')
plt.show()
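
The pie chart expresses each category as a share of total sales. Those percentages can also be computed directly, which makes them easy to quote in a report. The sketch below uses the six category totals shown in the table earlier in this section; the variable names are illustrative.

```python
import pandas as pd

# Category totals copied from the category_name_wise_sales table.
category_totals = pd.Series(
    {
        "Aquatic Tuberous Vegetables": 5000,
        "Cabbage": 17388,
        "Capsicum": 24346,
        "Edible Mushroom": 20373,
        "Flower/Leaf Vegetables": 51728,
        "Solanum": 10048,
    },
    name="total_sales",
)

# Percentage share of each category, rounded to one decimal place.
share_pct = (category_totals / category_totals.sum() * 100).round(1)
print(share_pct.sort_values(ascending=False))
```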


Date
['2020-07-01' '2020-07-02' '2020-07-03' '2020-07-04' '2020-07-05'
'2020-07-06' '2020-07-07' '2020-07-08' '2020-07-09' '2020-07-10'
'2020-07-11' '2020-07-12' '2020-07-13' '2020-07-14' '2020-07-15'
'2020-07-16' '2020-07-17' '2020-07-18' '2020-07-19' '2020-07-20'
'2020-07-21' '2020-07-22' '2020-07-23' '2020-07-24' '2020-07-25'
'2020-07-26' '2020-07-27' '2020-07-28' '2020-07-29' '2020-07-30'
'2020-07-31' '2020-08-01' '2020-08-02']
************************************************************************************
Time
['[Link].924' '[Link].295' '[Link].905' ... '[Link].623'
'[Link].092' '11']
************************************************************************************
Item Code
[1.02900005e+14 1.02900005e+14 1.02900005e+14 1.02900005e+14
1.02900005e+14 1.02900011e+14 1.02900005e+14 1.02900005e+14
1.02900005e+14 1.02900005e+14 1.02900005e+14 1.02900005e+14
1.02900005e+14 1.02900005e+14 1.02900011e+14 1.02900005e+14
1.02900011e+14 1.02900051e+14 1.02900005e+14 1.02900051e+14
1.02900005e+14 1.02900005e+14 1.02900005e+14 1.02900051e+14
1.02900005e+14 1.02900005e+14 1.02900005e+14 1.02900005e+14
1.02900005e+14 1.02900011e+14 1.02900005e+14 1.02900005e+14
1.02900005e+14 1.02900005e+14 1.02900005e+14 1.02900005e+14
1.02900011e+14 1.02900011e+14 1.02900011e+14 1.02900011e+14
1.02900005e+14 1.02900005e+14 1.02900005e+14 1.06956146e+14
1.02900005e+14 1.06956146e+14 1.02900011e+14 1.02900011e+14
1.02900011e+14 1.02900005e+14 nan]
************************************************************************************
Quantity Sold (kilo)
[0.396 0.849 0.409 ... 0.583 0.974 nan]
************************************************************************************
Unit Selling Price (RMB/kg)
[ 7.6 3.2 10. 8. 6. 18. 14. 9. 16. 39.8 4. 25.8 12. 7.
6.9 6.4 0.7 19.8 3.8 3.4 6.1 6.5 6.3 37.6 3.6 5.9 13.2 5.6
8.4 17.8 11. 6.8 7.8 9.7 3.5 13.6 11.7 5.8 6.6 17. 15.6 13.
5.7 9.3 3.1 6.2 5.3 4.8 6.7 5. 0.3 18.7 9.5 15.1 9.1 5.5
4.5 7.2 10.8 18.1 7.5 9.2 14.7 7.3 11.8 13.5 16.8 23.8 14.6 18.9
5.2 2.4 13.1 7.4 8.6 11.3 14.4 4.7 12.6 23.2 15.2 24.5 11.2 14.9
7.9 5.4 3.7 18.5 10.1 5.1 18.4 9.4 10.7 7.1 17.6 4.2 8.5 8.9
10.3 10.9 22.7 12.2 16.9 8.2 10.5 4.6 15.4 0.5 1.7 9.6 4.3 8.8
10.6 19.1 11.6 11.4 15.3 11.9 12.7 12.1 2.8 7.7 22.4 16.3 29.8 12.9
15. 16.4 11.5 11.1 0.9 25.6 18.8 17.3 4.1 15.8 13.3 23.3 8.7 0.4
13.9 1.5 22.2 17.1 8.3 2.9 17.9 18.3 8.1 17.2 12.5 3. 2.3 2.7
21.8 10.4 14.8 16.5 12.8 3.9 21.9 21.6 17.7 19. 2.2 10.2 3.3 2.5
12.3 1.2 18.2 16.2 2.6 2. 12.4 1.9 13.4 4.4 0.8 17.5 1.8 9.8
14.5 4.9 13.7 16.7 18.6 2.1 1.4 1.6 17.4 23.4 14.3 19.3 1. 31.8
16.6 14.1 15.7 0.2 0.1 13.8 1.1 1.3 22.9 23.7 nan]
************************************************************************************
Sale or Return
['sale' 'return' nan]
************************************************************************************
Discount (Yes/No)
['No' 'Yes' nan]
************************************************************************************
Wholesale Price (RMB/kg)
[ 4.32 2.1 7.03 4.6 6.72 4.44 5.65 3.44 10.8 4.64 5.76 3.88
9.23 8.47 6.03 7.58 8.24 6.81 4.06 29.43 3.6 6.16 3.19 4.63
1.63 7.83 6.56 12.1 10.29 11.69 9.24 7.5 4.2 4.13 8.16 3.97
5.52 3.18 1.46 4.57 7.1 4.52 12.13 4.23 3.58 3.93 4.48 6.85
9.19 8.05 5.71 8.9 3.72 7.27 10.13 4.05 6.15 6.83 4.76 4.11
6.46 11.75 7.33 29.54 6.88 5.31 3.75 4.61 3.57 5.7 2.11 6.76
4.81 8.97 6.32 3.89 4.3 4.47 5.61 12.2 7.9 8.07 6.75 10.05
6.58 4.42 1.48 3.79 4.74 9.2 3.98 6.21 6.23 7. 4.41 1.98
2.21 6.09 3.46 5.16 9.27 3.53 12. 5.77 6. 5.42 5.41 9.16
4.28 4.01 7.21 12.08 4.71 3.94 25.48 7.79 7.23 10.08 5.9 4.09
7.37 7.43 9.47 3.45 4.31 2.04 4.93 6.3 5.83 8.1 4.1 4.75
5.44 4.04 7.95 8.35 4.24 2.2 11.67 3.48 4.35 7.3 6.48 5.45
4.49 4.98 9.48 2.05 6.25 3.65 4.55 5.6 9.17 5.27 4.21 6.14
11.68 5.82 6.42 3.35 1.64 10.09 8.44 7.89 7.38 3.73 6.92 5.8
9.42 4.07 4.65 5.46 6.13 3.26 8.08 5.48 9.9 6.6 8.46 7.88
8. 10.56 6.19 10.42 7.55 5.23 9. 5.62 4.77 5.47 4.78 9.09
3.24 4.59 3.02 7.87 4.39 6.2 12.22 7.17 10.04 11.54 4.34 9.81
8.42 2.84 7.25 7.26 7.06 6.12 9.03 3.08 4.46 6.26 5.3 10.03
5.21 2.15 4.22 7.85 10.2 12.24 9.8 7.92 8.82 6.11 2.62 2.41
10.43 4.7 5.51 7.07 7.32 5.29 4.83 7.86 1.96 9.94 7.35 8.11
5.5 2.01 2.19 2.37 5.64 10. 7.51 8.3 8.15 6.37 10.61 5.05
3.86 8.91 10.38 2.38 11.9 4.72 10.39 6.08 2.22 4.25 4.94 4.26
2.24 24.05 6.27 8.49 10.19 5.56 2.39 7.91 4.43 10.01 4.86 5.34
7.81 11.06 11.37 2.07 5.4 8.29 7.09 4.97 7.84 2.17 6.9 4.92
5.86 7.8 5.24 3.33 6.06 3.76 10.35 10.18 11.36 8.8 1.76 5.06
6.4 3.64 4.88 3.2 2.97 2.16 1.91 5.67 5.35 6.84 6.05 5.
9.84 4.27 4. 1.7 11.45 2.87 5.69 5.59 1.77 1.52 6.91 5.04
4.19 4.91 9.86 6.36 6.07 7.76 2.49 9.96 6.38 4.03 7.97 11.65
6.8 10.83 9.97 1.4 2.82 7.74 6.24 5.66 5.26 11.3 2.46 7.82
12.51 2.03 3.1 4.45 7.72 4.4 5.03 5.14 3.22 8.86 4.08 7.56
2.45 7.78 10.28 12.96 4.37 10.95 5.02 7.75 2.02 2.55 4.96 2.83
5.38 3.21 10.85 3.09 9.91 5.73 8.93 13.74 3.78 5.33 3.28 9.02
10.97 2.8 5.43 2.71 4.38 4.8 8.95 3.12 12.29 11.21 2.18 3.38
1.56 10.21 3.83 8.71 8.88 12.44 3.13 11.24 6.66 3.37 4.29 7.77
4.33 1.8 5.28 9.83 3.14 3. 6.1 20. 8.7 11.35 12.55 3.99
9.85 5.07 6.54 10.66 5.97 2.08 4.73 3.04 3.16 9.11 3.5 9.99
11.29 11.26 8.26 5.11 2. 6.49 2.6 7.93 5.63 9.93 10.25 4.36
11.96 11.43 1.44 2.13 11.8 3.52 3.39 22. 10.74 6.68 4.69 5.18
9.98 2.79 8.63 4.62 9.82 11.61 3.62 9.18 9.79 5.2 5.54 10.82
4.67 11.56 3.11 9.95 3.55 11.16 11.27 3.63 3.34 4.66 7.08 3.84
9.72 11.82 7.73 10.59 11.52 3.06 9.44 3.43 7.63 5.81 10.89 5.1
4.16 4.89 6.65 9.76 10.49 9.89 10.52 5.36 6.67 15.35 7.4 9.6
5.72 6.39 2.12 7.14 11.08 4.53 9.26 6.5 9.43 5.17 16.34 4.79
10.33 5.57 10.87 3.41 6.74 4.51 5.32 12.16 5.09 4.12 2.06 4.82
8.43 11.07 9.34 15.72 7.11 10.26 3.69 7.54 7.22 12.6 7.46 9.28
5.12 15.31 11.57 10.34 14.37 4.99 6.55 9.22 3.68 12.62 7.6 6.44
4.02 10.81 4.14 6.33 11. 5.95 5.74 9.21 3.71 12.68 11.2 nan]
************************************************************************************
Item Name
['Paopaojiao (Jingpin)' 'Chinese Cabbage' 'Shanghaiqing' 'Caixin'
'Yunnan Shengcai' 'Sweet Chinese Cabbage' 'High Melon (1)'
'Yunnan Lettuces' 'Xixia Mushroom (1)' 'Green Hot Peppers'
'Red Pepper (1)' 'Amaranth' 'Broccoli' 'Spinach' 'Qinggengsanhua'
'7 Colour Pepper (1)' 'The Red Bell Pepper (1)' 'Green Line Pepper'
'Needle Mushroom (1)' 'Honghu Lotus Root' 'Hongshujian'
'Apricot Bao Mushroom (1)' 'Zhuyecai' 'Huangbaicai (2)'
'Green Eggplant (1)' 'Red Hang Pepper' 'Eggplant (2)' 'Millet Pepper'
'Haixian Mushroom (1)' 'Foreign Garland Chrysanthemum ' 'Bell Pepper (1)'
'Muercai' 'Wawacai' 'Jigu Mushroom (1)' 'Yellow Xincai (1)'
'Luosi Pepper' 'Dongmenkou Xiaobaicai' 'Dalong Eggplant'
'Needle Mushroom (Bag) (1)' 'Ping Mushroom' 'Net Lotus Root (1)'
'Nanguajian' 'The Crab Flavor Mushroom (2)' 'Niushou Youcai'
'The White Mushroom (2)' 'Haixian Mushroom (Bag) (1)'
'Jigu Mushroom (Bunch)' 'Lotus (Ea)' 'Red Hot Peppers' nan]
************************************************************************************
Category Code
[1.0110105e+09 1.0110101e+09 1.0110104e+09 1.0110108e+09 1.0110102e+09
1.0110105e+09 nan]
************************************************************************************
Category Name
['Capsicum' 'Flower/Leaf\xa0Vegetables' 'Aquatic Tuberous Vegetables'
'Edible Mushroom' 'Cabbage' 'Solanum' nan]
************************************************************************************
Loss Rate (%)
[ 7.08 22.27 14.43 13.7 15.25 9.43 29.25 12.81 13.82 6.72 11.76 18.52
9.26 18.51 17.06 15.98 8.93 7.8 3.43 24.05 8.42 5.05 13.62 15.61
5.01 9.99 6.07 5.86 9.89 26.16 7.59 16.33 7.61 2.48 8.99 10.64
10.18 27.84 10.94 8.85 11.6 5.54 13.46 16.04 12.17 7.3 5.96 12.42
6.73 nan]
************************************************************************************
total_sales
[3.0096 2.7168 3.1084 ... 3.003 7.018 nan]
************************************************************************************
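
In the Category Name values above, 'Flower/Leaf\xa0Vegetables' contains a non-breaking space (`\xa0`) instead of an ordinary space. A comparison or filter written with a normal space would silently miss those rows. A minimal sketch of normalising such labels, on a toy Series:

```python
import pandas as pd

names = pd.Series(["Capsicum", "Flower/Leaf\xa0Vegetables", "Edible Mushroom"])

# Replace non-breaking spaces with ordinary ones and trim stray whitespace.
clean_names = names.str.replace("\xa0", " ", regex=False).str.strip()

print((names == "Flower/Leaf Vegetables").sum())        # match count before cleaning
print((clean_names == "Flower/Leaf Vegetables").sum())  # match count after cleaning
```

The same `.str.strip()` call would also remove the trailing space visible in 'Foreign Garland Chrysanthemum ' among the item names.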

Q2: Calculator using operators and if else statements

num1 = int(input("Enter first number: "))
num2 = int(input("Enter second number: "))
operator = input("Enter operator (+, -, *, /): ")

if operator == "+":
    result = num1 + num2
    print(f"The result is {result}")
elif operator == "-":
    result = num1 - num2
    print(f"The result is {result}")
elif operator == "*":
    result = num1 * num2
    print(f"The result is {result}")
elif operator == "/":
    if num2 == 0:
        print("Division by zero is not allowed.")
    else:
        result = num1 / num2
        print(f"The result is {result}")
else:
    print("Invalid operator.")

Enter first number: 8


Enter second number: 9
Enter operator (+, -, *, /): +
The result is 17
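
The same calculator logic can be wrapped in a function so it can be checked without typing input each time. The version below is a sketch of one alternative layout, not the exercise's required form: it keeps the if statements for the two error cases and looks up the arithmetic in a dictionary. The names `OPERATIONS` and `calculate` are illustrative.

```python
import operator

# Map each supported symbol to the function that performs it.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(num1, num2, symbol):
    """Return the result of applying symbol to the two numbers, or an error message."""
    if symbol not in OPERATIONS:
        return "Invalid operator."
    if symbol == "/" and num2 == 0:
        return "Division by zero is not allowed."
    return OPERATIONS[symbol](num1, num2)


print(calculate(8, 9, "+"))
print(calculate(1, 0, "/"))
```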

Q3: Placement Data analysis


import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("/content/Placement_Data (1).csv")

print(df.head())

feature_names = df.columns
print("Feature Names:", feature_names)

num_records, num_columns = df.shape
print("Number of Records:", num_records)
print("Number of Columns:", num_columns)

print(df.info())

avg_10th_percentage = df['ssc_p'].mean()
print("Average 10th Grade Percentage:", avg_10th_percentage)

max_10th_percentage = df['ssc_p'].max()
print("Max 10th Grade Percentage:", max_10th_percentage)

toppers_10th = df[df['ssc_p'] == max_10th_percentage]


print("Number of 10th Grade Toppers:", len(toppers_10th))

highest_ssc_placement_status = toppers_10th['status'].values[0]
print("Placement Status of Highest 10th Grade Scorer:", highest_ssc_placement_status)

placement_counts = df['status'].value_counts()
print("Number of Placed Students:", placement_counts['Placed'])
print("Number of Unplaced Students:", placement_counts['Not Placed'])

unique_degrees = df['degree_t'].nunique()
print("Number of Unique Degrees:", unique_degrees)

correlation_10th_12th = df['ssc_p'].corr(df['hsc_p'])
print("Correlation between 10th and 12th Percentage:", correlation_10th_12th)

correlation_matrix = df.corr()
print("Correlation Matrix:")
print(correlation_matrix)

column_to_remove = 'sl_no'

df = df.drop(column_to_remove, axis=1)

print("Number of Null Values in Each Column:")
print(df.isnull().sum())

df = df.fillna(df.mean())

print("Number of Null Values in Each Column After Filling:")
print(df.isnull().sum())

plt.figure(figsize=(10, 6))
sns.scatterplot(x='ssc_p', y='hsc_p', data=df)
plt.title('Scatter Plot between 10th and 12th Percentage')
plt.xlabel('10th Percentage')
plt.ylabel('12th Percentage')
plt.show()

plt.figure(figsize=(10, 6))
sns.scatterplot(x='ssc_p', y='hsc_p', hue='status', data=df)
plt.title('Scatter Plot based on Placement Status')
plt.xlabel('10th Percentage')
plt.ylabel('12th Percentage')
plt.show()

plt.figure(figsize=(8, 5))
sns.boxplot(x='ssc_p', data=df)
plt.title('Boxplot for 10th Percentage')
plt.show()

plt.figure(figsize=(8, 5))
sns.boxplot(x='hsc_p', data=df)
plt.title('Boxplot for 12th Percentage')
plt.show()

plt.figure(figsize=(10, 6))
sns.boxplot(x='status', y='hsc_p', data=df)
plt.title('Boxplot for 12th Percentage based on Placement Status')
plt.show()

plt.figure(figsize=(12, 8))
sns.lineplot(data=df[['ssc_p', 'hsc_p', 'degree_p', 'mba_p']])
plt.title('Lineplot for 10th, 12th, Degree, and MBA Percentage')
plt.show()

continuous_columns = df.select_dtypes(include=['float64']).columns
correlation_continuous = df[continuous_columns].corr()

plt.figure(figsize=(10, 8))
sns.heatmap(correlation_continuous, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Heatmap of Correlation between Continuous Columns')
plt.show()


sl_no gender ssc_p ssc_b hsc_p hsc_b hsc_s degree_p \


0 1 M 67.00 Others 91.00 Others Commerce 58.00
1 2 M 79.33 Central 78.33 Others Science 77.48
2 3 M 65.00 Central 68.00 Central Arts 64.00
3 4 M 56.00 Central 52.00 Central Science 52.00
4 5 M 85.80 Central 73.60 Central Commerce 73.30

degree_t workex etest_p specialisation mba_p status salary


0 Sci&Tech No 55.0 Mkt&HR 58.80 Placed 270000.0
1 Sci&Tech Yes 86.5 Mkt&Fin 66.28 Placed 200000.0
2 Comm&Mgmt No 75.0 Mkt&Fin 57.80 Placed 250000.0
3 Sci&Tech No 66.0 Mkt&HR 59.43 Not Placed NaN
4 Comm&Mgmt No 96.8 Mkt&Fin 55.50 Placed 425000.0
Feature Names: Index(['sl_no', 'gender', 'ssc_p', 'ssc_b', 'hsc_p', 'hsc_b', 'hsc_s',
'degree_p', 'degree_t', 'workex', 'etest_p', 'specialisation', 'mba_p',
'status', 'salary'],
dtype='object')
Number of Records: 215
Number of Columns: 15
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 215 entries, 0 to 214
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sl_no 215 non-null int64
1 gender 215 non-null object
2 ssc_p 215 non-null float64
3 ssc_b 215 non-null object
4 hsc_p 215 non-null float64
5 hsc_b 215 non-null object
6 hsc_s 215 non-null object
7 degree_p 215 non-null float64
8 degree_t 215 non-null object
9 workex 215 non-null object
10 etest_p 215 non-null float64
11 specialisation 215 non-null object
12 mba_p 215 non-null float64
13 status 215 non-null object
14 salary 148 non-null float64
dtypes: float64(6), int64(1), object(8)
memory usage: 25.3+ KB
None
Average 10th Grade Percentage: 67.30339534883721
Max 10th Grade Percentage: 89.4
Number of 10th Grade Toppers: 1
Placement Status of Highest 10th Grade Scorer: Placed
Number of Placed Students: 148
Number of Unplaced Students: 67
Number of Unique Degrees: 3
Correlation between 10th and 12th Percentage: 0.5114721015997723
Correlation Matrix:
sl_no ssc_p hsc_p degree_p etest_p mba_p salary
sl_no 1.000000 -0.078155 -0.085711 -0.088281 0.063636 0.022327 0.063764
ssc_p -0.078155 1.000000 0.511472 0.538404 0.261993 0.388478 0.035330
hsc_p -0.085711 0.511472 1.000000 0.434206 0.245113 0.354823 0.076819
degree_p -0.088281 0.538404 0.434206 1.000000 0.224470 0.402364 -0.019272
etest_p 0.063636 0.261993 0.245113 0.224470 1.000000 0.218055 0.178307
mba_p 0.022327 0.388478 0.354823 0.402364 0.218055 1.000000 0.175013
salary 0.063764 0.035330 0.076819 -0.019272 0.178307 0.175013 1.000000
Number of Null Values in Each Column:
gender 0
ssc_p 0
ssc_b 0
hsc_p 0
hsc_b 0
hsc_s 0
degree_p 0
degree_t 0
workex 0
etest_p 0
specialisation 0
mba_p 0
status 0
salary 67
dtype: int64
Number of Null Values in Each Column After Filling:
gender 0
ssc_p 0
ssc_b 0
hsc_p 0
hsc_b 0
hsc_s 0
degree_p 0
degree_t 0
workex 0
etest_p 0
specialisation 0
mba_p 0
status 0
salary 0
dtype: int64
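
The 67 missing salaries match the 67 students who were not placed, so a missing salary presumably means "no job offer" rather than an unknown amount. Filling those gaps with the column mean gives every unplaced student an average salary, which can distort later salary statistics. The sketch below contrasts mean filling with two alternatives on toy rows; which one is appropriate depends on the question being asked, and none of them is claimed to be the notebook's intended approach.

```python
import numpy as np
import pandas as pd

# Toy rows shaped like the placement data (salary is missing when not placed).
demo = pd.DataFrame({
    "status": ["Placed", "Not Placed", "Placed"],
    "salary": [270000.0, np.nan, 200000.0],
})

# Option used in the notebook: replace missing salaries with the mean.
mean_filled = demo["salary"].fillna(demo["salary"].mean())

# Alternative 1: leave the gaps and summarise placed students only.
placed_mean = demo.loc[demo["status"] == "Placed", "salary"].mean()

# Alternative 2: treat "not placed" as a salary of zero.
zero_filled = demo["salary"].fillna(0)

print(mean_filled.tolist())
print(placed_mean)
print(zero_filled.tolist())
```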
<ipython-input-28-54e52fe3bd85>:51: FutureWarning: The default value of numeric_only in
correlation_matrix = df.corr()
<ipython-input-28-54e52fe3bd85>:65: FutureWarning: The default value of numeric_only in
df = df.fillna(df.mean())
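
Both warnings come from calling a numeric reduction on a frame that also holds text columns such as gender and status. In pandas 2.0 and later the same calls raise an error instead of warning. Passing `numeric_only=True` states the intent explicitly; the sketch below shows this on a small frame built from the first four rows printed above.

```python
import pandas as pd

# First four rows of three columns from the placement data.
demo = pd.DataFrame({
    "gender": ["M", "M", "M", "M"],
    "ssc_p": [67.0, 79.33, 65.0, 56.0],
    "hsc_p": [91.0, 78.33, 68.0, 52.0],
})

# Restrict the correlation and the mean to numeric columns explicitly.
corr = demo.corr(numeric_only=True)
filled = demo.fillna(demo.mean(numeric_only=True))

print(corr)
```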
