Nested Grouping in Pandas

This Jupyter notebook analyzes the 'tips.csv' dataset with pandas and seaborn. It draws box plots and count plots of total bill by time, day, sex, and smoking status, then uses group aggregations to summarize total_bill by time and smoker status.

Uploaded by

sushantsaini3333

7/9/24, 5:37 PM Untitled2

In [1]: import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]: data = pd.read_csv("tips.csv")
data.head(3)

Out[2]:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3

In [3]: sns.boxplot(x=data["time"], y=data["total_bill"])

Out[3]: <Axes: xlabel='time', ylabel='total_bill'>
[Figure: box plot of total_bill by time]

In [4]: sns.boxplot(x=data["time"], y=data["total_bill"], hue=data["smoker"])

Out[4]: <Axes: xlabel='time', ylabel='total_bill'>
[Figure: box plot of total_bill by time, split by smoker]

localhost:8888/lab/tree/Desktop/Pranshi/Untitled2.ipynb 1/6

In [5]: sns.countplot(x=data["time"])

Out[5]: <Axes: xlabel='time', ylabel='count'>
[Figure: count plot of time]


In [6]: sns.countplot(x=data["time"], hue=data["smoker"])

Out[6]: <Axes: xlabel='time', ylabel='count'>
[Figure: count plot of time, split by smoker]

In [7]: sns.boxplot(x=data["day"], y=data["total_bill"])

Out[7]: <Axes: xlabel='day', ylabel='total_bill'>
[Figure: box plot of total_bill by day]


In [8]: sns.boxplot(x=data["day"], y=data["total_bill"], hue=data["sex"])

Out[8]: <Axes: xlabel='day', ylabel='total_bill'>
[Figure: box plot of total_bill by day, split by sex]


In [9]: sns.boxplot(x=data["day"], y=data["total_bill"], hue=data["smoker"])

Out[9]: <Axes: xlabel='day', ylabel='total_bill'>
[Figure: box plot of total_bill by day, split by smoker]

In [ ]:

In [11]: data.groupby("time")["total_bill"].aggregate(["count"])

Out[11]:
        count
time
Dinner    176
Lunch      68

In [16]: data.groupby("time")["total_bill"].get_group("Dinner").groupby(data["smoker"]).aggregate(["count"])

Out[16]:
        count
smoker
No        106
Yes        70
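The nested call above works because pandas aligns the external key data["smoker"] to the selected rows by index label. A minimal sketch of that behaviour, using a small hypothetical frame named toy (not the real tips.csv), compared against a filter-first version that may be easier to read:

```python
import pandas as pd

# Hypothetical stand-in rows for illustration only; not the real tips.csv.
toy = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "smoker": ["No", "No", "Yes", "Yes", "No", "Yes"],
    "time": ["Dinner", "Dinner", "Dinner", "Lunch", "Lunch", "Lunch"],
})

# Notebook style: pull out one group, then group that Series by an
# external key. toy["smoker"] covers all rows; pandas reindexes it to
# the Dinner rows before grouping.
dinner_bills = toy.groupby("time")["total_bill"].get_group("Dinner")
nested = dinner_bills.groupby(toy["smoker"]).aggregate(["count"])

# Filter-first alternative: boolean mask, then group by column name.
filtered = (
    toy[toy["time"] == "Dinner"]
    .groupby("smoker")["total_bill"]
    .aggregate(["count"])
)

print(nested)
print(filtered)
```

Both tables should hold the same per-smoker counts for the Dinner rows; the second form simply avoids passing a full-length Series as the key.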

In [17]: data.groupby("time")["total_bill"].get_group("Lunch").groupby(data["smoker"]).aggregate(["count"])


Out[17]:
        count
smoker
No         45
Yes        23

In [18]: data.groupby("time")["total_bill"].get_group("Lunch").groupby(data["smoker"]).aggregate(["count", "mean", "median"])

Out[18]:
        count       mean  median
smoker
No         45  17.050889   15.95
Yes        23  17.399130   16.00
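The same count, mean, and median summary can be produced for every time and smoker combination in one pass by grouping on both keys. A sketch under the assumption of a small hypothetical frame named toy (the values are made up, not taken from tips.csv):

```python
import pandas as pd

# Hypothetical rows for illustration only; not the real tips.csv.
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0, 12.0, 18.0, 25.0, 35.0],
    "smoker": ["No", "No", "Yes", "Yes", "No", "No", "No", "Yes"],
    "time": ["Dinner"] * 4 + ["Lunch"] * 4,
})

# Group on both keys, then compute several statistics at once.
summary = (
    toy.groupby(["time", "smoker"])["total_bill"]
    .agg(["count", "mean", "median"])
)

# Selecting the outer level gives the per-smoker table for one meal,
# the same shape as the nested get_group result.
lunch = summary.loc["Lunch"]

print(summary)
print(lunch)
```

Selecting with summary.loc["Lunch"] drops the outer index level, so the result is indexed by smoker alone.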

In [19]: data.groupby("time").groups

Out[19]:
{'Dinner': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, ...],
'Lunch': [77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 117, 118, 119, 120, 121,
122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138,
139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 191, 192, 193, 194, 195, 196,
197, 198, 199, 200, 201, 202, 203, 204, 205, 220, 221, 222, 223, 224, 225, 226]}
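The groups attribute maps each group label to the index labels of its rows, which is why the output above is a long list of row numbers. A brief sketch on a hypothetical frame named toy (made-up rows, not tips.csv) showing how the mapping relates to the group sizes:

```python
import pandas as pd

# Hypothetical rows for illustration only; not the real tips.csv.
toy = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "time": ["Dinner", "Dinner", "Lunch", "Dinner", "Lunch"],
})

grouped = toy.groupby("time")

# Dict-like mapping: group label -> Index of row labels in that group.
groups = grouped.groups

# The length of each entry is the number of rows in the group,
# which is also what size() reports directly.
sizes = {label: len(rows) for label, rows in groups.items()}

print(groups)
print(sizes)
print(grouped.size())
```

For a quick overview of a large frame, size() is usually easier to read than the full mapping.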

In [20]: data.groupby(["time", "smoker"]).aggregate(["count"])

Out[20]:
              total_bill   tip   sex   day  size
                   count count count count count
time   smoker
Dinner No            106   106   106   106   106
       Yes            70    70    70    70    70
Lunch  No             45    45    45    45    45
       Yes            23    23    23    23    23
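Every column shows the same number above because count tallies non-null values per column and this dataset appears to have no missing entries. The sketch below uses a hypothetical frame named toy with one tip deliberately set to NaN (an assumption for illustration, not a property of tips.csv) to show where count and size would differ:

```python
import pandas as pd

# Hypothetical rows for illustration only; one tip is deliberately missing.
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 12.0, 18.0],
    "tip": [1.0, float("nan"), 3.0, 2.0, 2.5],
    "smoker": ["No", "No", "Yes", "No", "Yes"],
    "time": ["Dinner", "Dinner", "Dinner", "Lunch", "Lunch"],
})

grouped = toy.groupby(["time", "smoker"])

# count(): non-null values per remaining column, so columns can disagree.
per_column = grouped.count()

# size(): number of rows per group, missing values included.
rows = grouped.size()

print(per_column)
print(rows)
```

When the goal is simply the number of rows per group, size() gives one number per group instead of one per column.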

In [21]: data.groupby(["time", "smoker"])["total_bill"].aggregate(["count"])

Out[21]:
               count
time   smoker
Dinner No        106
       Yes        70
Lunch  No         45
       Yes        23
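A two-level count like the one above can also be laid out as a wide table with one row per time and one column per smoker value. A minimal sketch on a hypothetical frame named toy (made-up rows, not tips.csv), using unstack and, as an alternative, pd.crosstab:

```python
import pandas as pd

# Hypothetical rows for illustration only; not the real tips.csv.
toy = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "smoker": ["No", "No", "Yes", "Yes", "No", "Yes"],
    "time": ["Dinner", "Dinner", "Dinner", "Lunch", "Lunch", "Lunch"],
})

# Nested count as a Series with a (time, smoker) MultiIndex.
counts = toy.groupby(["time", "smoker"])["total_bill"].count()

# Move the inner level into columns: rows are time, columns are smoker.
wide = counts.unstack("smoker")

# crosstab counts rows per combination directly; it matches the table
# above as long as total_bill has no missing values.
cross = pd.crosstab(toy["time"], toy["smoker"])

print(wide)
print(cross)
```

The wide layout makes it easier to compare smokers and non-smokers within the same meal at a glance.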
