Data analysis made simple: Python Pandas
Python for Data Analysis: Pandas
The Pandas library is one of the most important and popular tools for Python data scientists and analysts, as it is the backbone of many data projects.
Pandas is an open-source Python package for data cleaning and data manipulation. It provides extended, flexible data structures to hold different types of labelled and relational data.
Pandas is built on the NumPy package, so a lot of the structure between them is similar. Pandas is also used in SciPy for statistical analysis or with Matplotlib for plotting functions.
Using Pandas, you can do things like:
Easily calculate statistics about data, such as finding the average, distribution, and median of columns
Use data visualization tools, such as Matplotlib, to easily create bar plots, histograms, and more
Clean your data by filtering columns by particular criteria or easily removing values
Manipulate your data flexibly using operations like merging, joining, reshaping, and more
Read, write, and store your clean data as a database, txt file, or CSV file
Installing Pandas
You can install Pandas using the built-in Python tool pip by running the following command.
$ pip install pandas
The data types available to us in Pandas are also called dtypes:
object: text or mixed numeric and non-numeric values
int64: integer numbers
bool: true/false values
float64: floating point numbers
category: finite list of text values
datetime64: date and time values
timedelta[ns]: differences between two datetimes
A data structure is a particular way of organizing our data. Pandas has two data structures, and all operations are based on those two objects:
Series
DataFrame
Series are the columns, and the DataFrame is a table composed of a collection of series. A Series can best be described as a single column of a 2-D array that can store data of any type.
A DataFrame is like a table that stores data similar to a spreadsheet, using multiple columns and rows. Each value in a DataFrame object is associated with a row index and a column index.
Let's look at both Pandas data structures with some additional annotation.
We create a Series by invoking the pd.Series() method and passing a list of values. By default, Pandas counts the index from 0; we can also explicitly define those index values. The srs.values attribute returns the values stored in the Series object, and srs.index.values returns the index values.
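A minimal sketch of these attributes (the snippet this passage originally referenced is not shown here):
import pandas as pd

# Default integer index counting from 0
srs = pd.Series([11.9, 36.0, 16.6, 21.8, 34.2])

print(srs.values)        # the stored values as a NumPy array
print(srs.index.values)  # the index labels, [0 1 2 3 4] by default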
Assign names to our values
Each index corresponds to its value in the Series object. Let's look at an example where we assign a country name to population growth rates.
Example:
# Importing pandas in our program
import pandas as pd
# Defining a series object
srs = pd.Series([11.9, 36.0, 16.6, 21.8, 34.2], index = ['China', 'India', 'USA', 'Brazil', 'Pakistan'])
# Set Series name
srs.name = "Growth Rate"
# Set index name
srs.index.name = "Country"
# printing series values
print("The Indexed Series values are:")
print(srs)
The attribute srs.name sets the name of our Series object. The attribute srs.index.name then sets the name for the indexes.
Select entries from a Series
We select elements based on the index name or index number.
import numpy as np
import pandas as pd
srs = pd.Series(np.arange(0, 6, 1), index = ['ind0', 'ind1', 'ind2', 'ind3', 'ind4', 'ind5'])
srs.index.name = "Index"
print("The original Series:\n", srs)
print("\nSeries element at index ind3:")
print(srs['ind3']) # Fetch element at index named ind3
print("\nSeries element at index 3:")
print(srs[3]) # Fetch element at index 3
print("\nSeries elements at multiple indexes:\n")
print(srs[['ind1', 'ind4']]) # Fetch elements at multiple indexes
The elements from the Series are selected in three ways:
On line 9, the element is selected based on the index name.
On line 12, the element is selected based on the index number. Keep in mind that index numbers start from 0.
On line 15, multiple elements are selected from the Series by passing multiple index names inside the [].
Drop entries from a Series
Dropping an unwanted index is a common operation in Pandas. If the drop(index_name) function is called with a given index on a Series object, the desired index name is deleted.
import numpy as np
import pandas as pd
srs = pd.Series(np.arange(0, 6, 1), index = ['ind0', 'ind1', 'ind2', 'ind3', 'ind4', 'ind5'])
srs.index.name = "Index"
print("The original Series:\n", srs)
srs = srs.drop('ind2') # drop index named ind2
print("The New Series:\n", srs)
The output shows that the ind2 index is dropped. Also, an index can only be dropped by specifying the index name and not the number. So, srs.drop(srs[2]) does not work.
DataFrame: the most important operations
Using the pandas.DataFrame() function
To create a Pandas DataFrame from a NumPy array, pass the NumPy array as an argument to the pandas.DataFrame() function. You can also pass the index and column labels for the DataFrame. The following is the syntax:
df = pandas.DataFrame(data=arr, index=None, columns=None)
There are several ways to make a DataFrame in Pandas. The easiest way to create one from scratch is to pass your data to the DataFrame constructor and print the resulting df.
We can also create a dict and pass our dictionary data to the DataFrame constructor. Say we have some data on vegetable sales and want to organize it by type of vegetable and quantity. We build the dict and pass it to the constructor with a simple command.
Each item, or value, in our data will correspond with a column in the DataFrame we created, just like a chart. The index for this DataFrame is listed as numbers by default, but we can specify it further depending on our needs. Say we wanted to know quantity per month; that would be our new index. The sketch below walks through both steps.
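A minimal sketch, assuming illustrative vegetable names and quantities:
import pandas as pd

# Hypothetical vegetable sales data: type of vegetable -> quantities sold
data = {'carrots': [10, 15, 20], 'peppers': [5, 8, 12]}

# Pass the dict to the constructor; the index defaults to 0, 1, 2, ...
quantity = pd.DataFrame(data)
print(quantity)

# Specify months as the index to see quantity per month
quantity = pd.DataFrame(data, index=['June', 'July', 'August'])
print(quantity)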
Get info about your data
One of the first commands you run after loading your data is .info(), which provides all the essential information about a dataset.
You can access more information with other operations, like .shape, which outputs a tuple of (rows, columns).
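Continuing with the quantity DataFrame sketched above:
quantity.info()        # column dtypes, non-null counts, and memory usage
print(quantity.shape)  # (3, 2): three rows, two columns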
We use the .columns operator to print a dataset's column names.
quantity.columns
You can then rename your columns easily with the .rename() method.
quantity.rename(columns = {'carrots':'bananas'})
Searching and selecting in our DataFrame
We need to know how to manipulate or access the data in our DataFrame, such as selecting, searching, or deleting data values. You can do this either by column or by row. Let's see how it's done. The easiest way to select a column of data is by using brackets [ ]. We can also use brackets to select multiple columns. Say we only wanted to look at June's vegetable quantity; the sketch below shows these selections.
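Assuming the month-indexed quantity DataFrame from earlier:
# Select a single column with brackets
print(quantity['carrots'])

# Select multiple columns by passing a list of names
print(quantity[['carrots', 'peppers']])

# Select June's row by index name
print(quantity.loc['June'])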
Note: loc and iloc are used for locating data.
.iloc locates by numerical index
.loc locates by the index name. This is similar to list slicing in Python.
The Pandas DataFrame object also provides methods to select specific columns. The following example shows how it can be done.
quantity['peppers']
Create a new DataFrame from pre-existing columns
We can also grab multiple columns and create a new DataFrame object from them.
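For instance, a minimal sketch using the columns from earlier:
# Select two columns and copy them into a new DataFrame object
new_df = quantity[['carrots', 'peppers']].copy()
print(new_df)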
Reindex data in a DataFrame
We can also reindex the data either by the indexes themselves or by the columns. Reindexing with reindex() allows us to make changes without messing up the initial setting of the objects.
Note: The rules for reindexing are the same for Series and DataFrame objects.
# Importing pandas in our program
import pandas as pd
# Defining a series object
srs1 = pd.Series([11.9, 36.0, 16.6, 21.8, 34.2], index = ['China', 'India', 'USA', 'Brazil', 'Pakistan'])
# Set Series name
srs1.name = "Growth Rate"
# Set index name
srs1.index.name = "Country"
srs2 = srs1.reindex(['China', 'India', 'Malaysia', 'USA', 'Brazil', 'Pakistan', 'England'])
print("The series with new indexes is:\n",srs2)
srs3 = srs1.reindex(['China', 'India', 'Malaysia', 'USA', 'Brazil', 'Pakistan', 'England'], fill_value=0)
print("\nThe series with new indexes is:\n",srs3)
On line 11, the indexes are changed: the new index names ('Malaysia' and 'England') are added, and since srs1 has no values for them, they are assigned NaN by default. On line 13, fill_value=0 replaces those NaN entries with 0. For a DataFrame, the columns keyword should be used to reindex the columns; the rules are the same as for the indexes, and NaN values are assigned to the whole new column by default.
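A short sketch of column reindexing on a DataFrame (the frame here is illustrative):
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['Row1', 'Row2'])

# Use the columns keyword to reindex columns; the new column 'C' gets NaN
df2 = df.reindex(columns=['A', 'B', 'C'])
print(df2)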
How to read or import Pandas data
It is quite easy to read or import data from other files using the Pandas library. In fact, we can use various sources, such as CSV, JSON, or Excel, to load our data and access it.
Reading and importing data from CSV files
We can import data from a CSV file, which is common practice for Pandas users.
We simply create or open a CSV file, copy the data, paste it into a text editor, and save it in the same directory that houses your Python scripts.
You then use a bit of code to read the data using the read_csv function built into Pandas.
read_csv will generate a new index column by default, so if we want the first column of the file to serve as the index, we need to change this. We can do so by passing the index_col parameter to tell Pandas which column to index.
Once we've used Pandas to sort and clean data, we can then save it back as the original file with simple commands. You only have to input the filename and extension.
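A minimal sketch, assuming an illustrative file named vegetables.csv in the working directory:
import pandas as pd

# Read the CSV, using the file's first column as the index
df = pd.read_csv('vegetables.csv', index_col=0)

# ... sort and clean the data ...

# Save the cleaned data back out; just supply a filename and extension
df.to_csv('vegetables_clean.csv')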
Reading and importing data from JSON (JavaScript Object Notation)
Example:
{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
Say you have a JSON file. A JSON file is basically like a stored Python dict, so Pandas can easily access and read it using the read_json function. Let's look at an example.
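A minimal sketch, assuming the glossary above is saved as glossary.json:
import pandas as pd

# read_json parses the JSON file into a DataFrame
df = pd.read_json('glossary.json')
print(df)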
Reading and importing data from an Excel file
Say you have an Excel file. You can similarly use the read_excel function to access and read that data. Once we call the read_excel function, we pass the name of the Excel file as our argument, so read_excel will open the file's data. We can then use print() to display the data. If we want to go one step further, we can add the loc method from earlier, allowing us to read specific rows and columns of our file.
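A minimal sketch, with an illustrative filename and column names:
import pandas as pd

# Open the Excel file's data (requires an engine such as openpyxl)
df = pd.read_excel('sales.xlsx')
print(df)

# Go one step further: read specific rows and columns with loc
print(df.loc[0:4, ['Product', 'Quantity']])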
Data Wrangling with Pandas (Combining DataFrames)
Once we have our data, we can use data wrangling processes to manipulate and prepare it for analysis. The most common data wrangling processes are merging, concatenation, and grouping.
Merge method - Description
left - use keys from the left frame only
right - use keys from the right frame only
outer - use the union of keys from both frames
inner - use the intersection of keys from both frames
merge()
We can join or merge two data frames in Pandas using the merge() function. The different arguments to merge() allow you to perform a natural join, left join, right join, and full outer join in pandas. There are also other types of join and concatenate operations, such as joins based on the index (row index and column index).
Join or merge in Pandas - syntax:
merge(left_df, right_df, on='Customer_id', how='inner')
left_df - DataFrame 1
right_df - DataFrame 2
on - columns (names) to join on; must be found in both the left and right DataFrame objects
how - the type of join to be performed: 'left', 'right', 'outer', or 'inner'; the default is an inner join
The data frames must have the same column names on which the merging happens. The merge() function in pandas is similar to the database join operation in SQL.
UNDERSTANDING THE DIFFERENT TYPES OF JOIN OR MERGE IN PANDAS:
Inner join or natural join: to keep only rows that match from the data frames, specify the argument how='inner'.
Outer join or full outer join: to keep all rows from both data frames, specify how='outer'.
Left join or left outer join: to include all the rows of your data frame x and only those from y that match, specify how='left'.
Right join or right outer join: to include all the rows of your data frame y and only those from x that match, specify how='right'.
Example:
import pandas as pd
import numpy as np
# data frame 1
d1 = {'Customer_id':pd.Series([1,2,3,4,5,6]),
'Product':pd.Series(['Oven','Oven','Oven','Television','Television','Television'])}
df1 = pd.DataFrame(d1)
# data frame 2
d2 = {'Customer_id':pd.Series([2,4,6,7,8]),
'State':pd.Series(['California','California','Texas','New York','Indiana'])}
df2 = pd.DataFrame(d2)
Printing df1 and df2 shows the two frames we are about to merge on Customer_id: df1 holds customers 1-6 with their products, and df2 holds customers 2, 4, 6, 7, and 8 with their states.
Inner join in pandas:
Returns only the rows in which the left table has matching keys in the right table.
Example:
#inner join in python pandas
inner_join_df= pd.merge(df1, df2, on='Customer_id', how='inner')
inner_join_df
The resultant DataFrame contains only Customer_id values 2, 4, and 6, which appear in both frames.
Outer join in pandas:
Returns all rows from both tables, joining records from the left table that have matching keys in the right table. Where there is no match in either table, NaN is returned.
Example:
# outer join in python pandas
outer_join_df=pd.merge(df1, df2, on='Customer_id', how='outer')
outer_join_df
The resultant DataFrame contains all Customer_id values 1 through 8, with NaN wherever a frame has no matching row.
Left outer join or left join in pandas:
Returns all rows from the left table, and any rows with matching keys from the right table. Where there is no match in the right table, NaN is returned.
Example:
# left join in python
left_join_df = pd.merge(df1, df2, on='Customer_id', how='left')
left_join_df
The resultant DataFrame contains all rows from df1 (Customer_id 1-6), with State set to NaN where df2 has no match.
Right outer join or right join in pandas:
Returns all rows from the right table, and any rows with matching keys from the left table.
Example:
# right join in python pandas
right_join_df= pd.merge(df1, df2, on='Customer_id', how='right')
right_join_df
The resultant DataFrame contains all rows from df2 (Customer_id 2, 4, 6, 7, and 8), with Product set to NaN for customers 7 and 8.
Pandas - Joining DataFrames with Concat and Append
It is frequently required to join dataframes together, such as when data is loaded from multiple files or even multiple sources. pandas.concat() is used to add the rows of multiple dataframes together and produce a new dataframe with the combined data.
concat
The pandas.concat function joins a number of dataframes along one of the axes. The default is to join along the index, stacking the dataframes on top of each other.
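A minimal sketch with two illustrative frames:
import pandas as pd

df_a = pd.DataFrame({'x': [1, 2]})
df_b = pd.DataFrame({'x': [3, 4]})

# By default concat joins along the index, stacking the rows
print(pd.concat([df_a, df_b]))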
pandas.concat parameters (with defaults):
objs - list of DataFrame or Series objects
axis - the axis to concatenate along (0 = 'index', 1 = 'columns'); default 0
join - how to handle indexes on the other axis ('inner' or 'outer'); default 'outer'
ignore_index - boolean value on preserving the source index; default False
keys - sequence used to create a hierarchical index using the passed keys; default None
levels - list of sequences used to create a MultiIndex; default None
names - list of names for the levels in the hierarchical index; default None
verify_integrity - boolean value to specify whether the new concatenated axis contains duplicates; default False
sort - boolean value to specify sorting of the non-concatenation axis if it is not already aligned when join is 'outer'; default False
copy - boolean value to specify whether data is copied unnecessarily; default True
concat with different column names
If the dataframes have different column names, concat still works: with the default outer join, missing entries are filled with NaN.
concat with axis = 1
The concat function has a number of parameters, all of which have defaults. The axis parameter specifies the axis along which to join the dataframes: 0 for the index (the default) and 1 for the columns. Calling concat() with axis = 1 joins two dataframes along the columns.
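Continuing the sketch above:
# axis=1 places the frames side by side, aligned on the index
print(pd.concat([df_a, df_b], axis=1))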
concat with inner join
Concatenating two dataframes with an inner join keeps only the matching indexes.
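For example:
# With join='inner', only index labels present in both frames survive
df_c = pd.DataFrame({'y': [5, 6, 7]})  # has one extra index label (2)
print(pd.concat([df_a, df_c], axis=1, join='inner'))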
Concatenating multiple dataframes
More than two dataframes can be concatenated together. The default is to concatenate along the index. Multiple dataframes can also be concatenated along the columns with axis=1.
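Sketch:
# Three or more frames can be combined in a single call
print(pd.concat([df_a, df_b, df_c]))          # stacked along the index
print(pd.concat([df_a, df_b, df_c], axis=1))  # side by side along the columns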
DataFrame.append
The append() instance method performs the same function as concat by appending a Series or DataFrame onto the end of the calling dataframe and returning a new dataframe. (Note that append() was deprecated in recent Pandas versions in favor of concat().)
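A sketch of the equivalent operations:
# Older Pandas: new_df = df_a.append(df_b)
# Recent Pandas: use concat instead
new_df = pd.concat([df_a, df_b], ignore_index=True)
print(new_df)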
groupby() function in pandas
The Pandas DataFrame.groupby() function is used to collect identical data into groups and perform aggregate functions on the grouped data. A group-by operation involves splitting the data, applying some functions, and finally aggregating the results.
In Pandas, you can use groupby() with a combination of sum(), count(), pivot(), transform(), aggregate(), and many more methods to perform various operations on grouped data.
Aggregation function - Description
sum() - sum of values
mean() - mean (average) of values
min() / max() - minimum / maximum value
std() / var() - standard deviation / variance
count() - count of non-missing values
nunique() - number of unique values
cumsum() - cumulative sum
agg() - apply multiple aggregation functions
The 'groupby' function is commonly used in data analysis. It is used to gain insights into the relationship between variables.
Key points -
groupby() is used to split data into groups based on one or more keys, allowing for efficient analysis and aggregation of grouped data.
It supports various aggregation functions like sum, mean, count, min, and max, which can be applied to each group.
You can apply multiple aggregations on different columns using .agg(), offering more flexibility in analysis.
The result of groupby() often returns a DataFrame with a MultiIndex, where each level represents a grouping key.
You can filter groups based on specific conditions by using .filter() after groupby().
groupby() allows iteration over groups, enabling customized operations on each subset of data.
The syntax for 'groupby()' is as follows:
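A sketch of the signature with its common defaults:
# General form of DataFrame.groupby(), with the usual defaults:
# df.groupby(by=None, axis=0, level=None, as_index=True,
#            sort=True, group_keys=True, observed=False, dropna=True)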
Parameters of Pandas DataFrame.groupby()
by - list of column names to group by
axis - defaults to 0; takes 0 or 'index', 1 or 'columns'
level - used with a MultiIndex
as_index - whether to return the group labels as the index; set False for SQL-style grouped output
sort - defaults to True; specify whether to sort after the group operation
group_keys - whether to add the group keys or not
squeeze - deprecated in new versions
observed - only applies if any of the groupers are Categoricals
dropna - defaults to True, which drops None/NaN values from the group keys; use False to keep them
For example, we can apply the groupby() function along with the sum() function to perform the sum operation on grouped data, as in the sketch below.
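A minimal sketch, using an illustrative courses dataset (the document's original data is not shown):
import pandas as pd

# Illustrative dataset with the column names used later in this section
df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Spark', 'Pandas', 'Pandas'],
    'Fee': [20000, 25000, 22000, 24000, 26000],
    'Duration': ['30days', '40days', '35days', '60days', '50days'],
})

# Split into groups by course, then sum the fees within each group
print(df.groupby('Courses')['Fee'].sum())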
groupby() on Two or More Columns
You can also group by two or more columns by passing a list of column names, as below.
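Sketch:
# Group on both Courses and Duration, then aggregate the Fee column
print(df.groupby(['Courses', 'Duration'])['Fee'].sum())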
Add Index to the Grouped Data
By default, the groupby() function doesn't return the row index; you can add the index using the DataFrame.reset_index() method, as sketched below.
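Sketch:
# reset_index() turns the group keys back into ordinary columns
result = df.groupby(['Courses', 'Duration'])['Fee'].sum().reset_index()
print(result)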
Drop NA/None/NaN (on group key) from the Result
You can also choose whether to include NA/None/NaN in the group keys by setting the dropna parameter. By default, dropna is set to True, so None/NaN values are not included in the group keys; to include them, set the dropna=False parameter.
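Sketch:
import numpy as np

# Add a row with a NaN group key for illustration
df2 = pd.concat([df, pd.DataFrame({'Courses': [np.nan], 'Fee': [1500],
                                   'Duration': ['15days']})])

# dropna=False keeps the NaN key as its own group
print(df2.groupby('Courses', dropna=False)['Fee'].sum())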
Sort groupby() Result by Group Key
To remove sorting on grouped results in pandas, you can pass the sort=False parameter to the groupby() function. This ensures that the grouped results are not sorted by the group key, preserving the original order of appearance of the courses in the DataFrame.
To sort the group keys (courses) in descending order after performing the groupby() operation, you can use the sort_index() method with the ascending=False parameter: the code below first groups the DataFrame by Courses, calculates the sum of each group, and then sorts the group keys in descending order.
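Sketch:
# Preserve the original order of appearance of the courses
print(df.groupby('Courses', sort=False)['Fee'].sum())

# Group, sum, then sort the group keys in descending order
print(df.groupby('Courses')['Fee'].sum().sort_index(ascending=False))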
Apply More Aggregations
You can compute multiple aggregations at the same time on grouped data simply by passing a list of aggregate functions to aggregate().
To compute different aggregations on different columns in a grouped DataFrame, you can pass a dictionary to the agg() function specifying the aggregation function for each column. The example below calculates the count of the grouped Duration column and the min and max of the grouped Fee column.
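Sketch:
# A list of functions applies every aggregation to the Fee column
print(df.groupby('Courses')['Fee'].aggregate(['min', 'max', 'mean']))

# A dict applies a different aggregation to each column:
# count on Duration, min and max on Fee
print(df.groupby('Courses').agg({'Duration': 'count', 'Fee': ['min', 'max']}))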
Pandas Handling Missing Data in DataFrame
What is Missing Data?
In the world of Data Science, a Pandas DataFrame is the most popular and globally accepted data structure for storing large-scale data in the form of rows and columns, just like an Excel spreadsheet or SQL table. A DataFrame can contain almost any type of data; however, missing data in a DataFrame refers to values that are unavailable.
Example of Missing Data in a Pandas DataFrame
"Missing data in a DataFrame" simply means the values that are unavailable or missing in a Pandas DataFrame. Values that are missing in a DataFrame are automatically replaced by the NaN type (here NaN comes from NumPy). In the following example, we have two missing values in a DataFrame, which are replaced by the NaN value.
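A minimal sketch with two illustrative missing values:
import pandas as pd
import numpy as np

# Two values are missing and show up as NaN
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Carol'],
                   'Age': [25, np.nan, 31],
                   'City': ['Paris', 'London', np.nan]})
print(df)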
Why Should You Handle Missing Data in a DataFrame?
In the process of exploratory data analysis, one of the most important steps is data preprocessing, where you will mainly be dealing with missing data handling. Before looking into the insights of data, you need a clean dataset, free of outliers and missing values.
You need to handle missing data in a Pandas DataFrame because:
1. Missing values in a DataFrame negatively affect the data insights
2. Training a Machine Learning model needs a clean dataset
3. A DataFrame with missing values is hard to process, visualize, and build a data pipeline on
So, you need to find the missing data in your DataFrame and get rid of the missing values.
How to Find Missing Data in a DataFrame?
Use functions like isna() or isnull() to detect missing values. Pair them with sum() to count missing entries.
There are several common checks, each sketched in the code below:
1. Find rows having NaN values
2. Find columns having NaN values
3. Find the percentage of missing data in each column (here, DataFrame.isna() is used to check if the DataFrame has NA values)
4. Find the number of NaN values in each row w.r.t. the columns
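Sketches of the four checks, continuing with the illustrative df above:
# 1. Rows having NaN values
print(df[df.isna().any(axis=1)])

# 2. Columns having NaN values
print(df.columns[df.isna().any()].tolist())

# 3. Percentage of missing data in each column
print(df.isna().mean() * 100)

# 4. Number of NaN values in each row
print(df.isna().sum(axis=1))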
Different Methods to Handle Missing Data in a DataFrame
Based on the data you are working with, you may have to follow any of the following techniques for handling missing data in a DataFrame. Review all of the methods and apply the one that best suits your need.
The best ways to handle missing data in a DataFrame are:
1. Remove rows or columns from the DataFrame that have missing data
2. Replace the missing data with another value
1. Remove Rows or Columns Having Missing Data
We can simply find rows or columns where we have missing data and drop them using Pandas functions.
1.1 Removing Rows Having Missing Data
In Pandas, we can use the function df.dropna() to remove all rows that have missing data.
1.2 Removing Columns Having Missing Data
Just like removing rows, we can also remove columns from our DataFrame that have missing data. The same Pandas built-in function, df.dropna(), can be used with an extra "axis" parameter.
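Sketch:
# Drop every row that contains at least one missing value
rows_removed = df.dropna()

# Drop every column that contains at least one missing value
cols_removed = df.dropna(axis=1)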
2. Replace Missing Data in DataFrame
This method is a bit tedious, yet a more powerful and optimistic way to handle missing data in a DataFrame. You will have a lot of ways to replace the missing data in the DataFrame.
To replace missing data in a DataFrame you can use the following methods:
1. Replace missing data with fixed values in the DataFrame
2. Replace missing data with the mean value
3. Replace missing data with the median value
2.1 Replace Missing Data with Fixed Values in DataFrame
We can impute the missing values in the DataFrame with a fixed value. The fixed value can be an integer or any other data depending on the nature of your dataset. For example, if you are dealing with gender data, you can replace all the missing values with the word "unknown", "Male", or "Female". Common fixed-value replacements include replacing NaN with 0, replacing NaN with an empty string, or imputing all missing values with a random number generated using the Python random module.
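Sketch:
import random

df_zero = df.fillna(0)    # replace NaN with 0
df_empty = df.fillna('')  # replace NaN with an empty string

# Replace all NaN with one random number from the random module
df_rand = df.fillna(random.randint(0, 100))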
2.2 Replace Missing Data with Mean Value
You can use the mean value to replace the missing values in case the data distribution is symmetric. You have a choice between three statistics functions: mean, mode, or median. It strongly depends on the dataset you are working on, as in the sketch below.
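Sketch:
# Fill missing Age values with the column mean
df['Age'] = df['Age'].fillna(df['Age'].mean())

# Median is often the better choice when the distribution is skewed
# df['Age'] = df['Age'].fillna(df['Age'].median())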
Pivot Tables in Pandas
Pivot tables are tables of grouped values that aggregate specific items of an original table into one or more discrete categories. They are a way of creating short summaries of your original dataset that display things such as sums of columns, averages, or any other statistical value you are interested in. By summarizing large amounts of data into pivot tables, you usually notice some patterns, which helps you deduce how your data behaves based on certain factors. This knowledge is very useful because it can help subject matter experts make better strategic decisions.
Key Differences:
Purpose: pivot tables reshape and summarize data; groupby groups and aggregates; merge combines data based on a key; concat stacks/appends data.
Requires aggregation: pivot tables and groupby, yes; merge and concat, no.
Reshapes data: only pivot tables (grid format).
Key for combining: pivot tables, none; groupby, grouping key(s); merge, common key(s); concat, not required.
Use case: pivot tables for multi-dimensional summary analysis; groupby for aggregating column values; merge for joining datasets; concat for adding new rows or columns.
The pivot_table() function in Pandas allows us to create a spreadsheet-style pivot table, making it easier to group and analyze our data.
To create a pivot table using this method you need to define values for the following parameters:
index
columns (optional)
values
aggfunc
The index parameter defines what is going to be the index of your pivot table. For example, it defines how the rows of your original DataFrame are going to be grouped into categories. If you input a list of values instead of just one value, you are going to end up with a multi-index as your row index.
The columns parameter is an optional parameter that allows you to introduce an extra value to your columns index, which in turn transforms your pivot table column index into a multi-index.
The values parameter defines which columns you want to aggregate. Essentially it tells Pandas what it needs to aggregate, based on some aggregation function, after your data has been grouped based on the values you entered for the index parameter.
The aggfunc parameter defines which type of aggregation you want to perform. Based on what you decide to use here, you can access various information such as the means, the sums, etc. If you want to, you can also enter multiple values here, which will end up transforming your column index into a multi-index.
In the following example, we reshape the DataFrame with Date as the index, City as the columns, and Temperature as the values.
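A minimal sketch with illustrative weather data:
import pandas as pd

# Illustrative data with the columns used throughout this section
df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
    'City': ['London', 'Paris', 'London', 'Paris'],
    'Temperature': [5, 7, 6, 9],
    'Humidity': [85, 80, 88, 75],
})

# Reshape: Date as index, City as columns, Temperature as values
print(df.pivot_table(index='Date', columns='City', values='Temperature'))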
pivot_table() is based on the following syntax:
index - keys to group by on the pivot table index
columns - keys to group by on the pivot table columns
values - columns used for the aggregation data of the pivot table
aggfunc - function or list of functions used for aggregation
pivot_table() with Multiple Values
We can also create a pivot table for multiple values, i.e. Temperature and Humidity, as in the sketch below.
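Sketch:
# Two value columns produce a column MultiIndex
print(df.pivot_table(index='Date', columns='City',
                     values=['Temperature', 'Humidity']))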
pivot_table() With Aggregate Functions
We can use the pivot_table() method with different aggregate functions using the aggfunc parameter. We can set the value of aggfunc to functions such as 'sum', 'mean', 'count', 'max' or 'min'.
For example, we can calculate the mean temperature of each city using the aggfunc='mean' argument in pivot_table(), as sketched below.
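Sketch:
# Mean temperature per city
print(df.pivot_table(index='City', values='Temperature', aggfunc='mean'))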
Pivot Table With MultiIndex
We can create a pivot table with a MultiIndex using the pivot_table() function, by passing a list of columns as the index argument.
A MultiIndex contains multiple levels of indexes, with columns linked to one another through a parent/child relationship. Here, Country is the parent column and City is the child column, as in the sketch below.
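Sketch (adding an illustrative Country column to the data):
# Country is the parent level, City the child level of the row MultiIndex
df['Country'] = ['UK', 'France', 'UK', 'France']
print(df.pivot_table(index=['Country', 'City'], values='Temperature'))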
Advanced Pivot Options
Alternatively, we may use more options, which have the default values shown in the signature sketch below.
Useful pivot options are:
fill_value - value to replace missing values with
dropna - exclude columns whose entries are all NaN
margins - add rows/columns with subtotals and grand totals
sort - sort the results
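A sketch of the signature with these defaults:
# pivot_table() signature with the defaults relevant here:
# DataFrame.pivot_table(values=None, index=None, columns=None, aggfunc='mean',
#                       fill_value=None, margins=False, dropna=True, sort=True)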
Pivot Table with Multiple aggfunc
We can use multiple aggregation functions, and the functions may differ between columns. For example:
'D' - mean
'E' - min and max
The sketch below shows this.
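Sketch, assuming an illustrative frame with a grouping column 'A' and value columns 'D' and 'E':
import pandas as pd

df2 = pd.DataFrame({'A': ['x', 'x', 'y', 'y'],
                    'D': [1.0, 2.0, 3.0, 4.0],
                    'E': [10, 20, 30, 40]})

# 'D' is aggregated with mean; 'E' with both min and max
print(df2.pivot_table(index='A', aggfunc={'D': 'mean', 'E': ['min', 'max']}))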
Pivot table replace NaN
To replace NaN values in the pivot table, we can use the fill_value parameter. For example, we can replace NaN values with 0.
Pivot table remove NaN
To drop columns with NaN values, we can use the option dropna=True, as sketched below.
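Sketch:
# Replace NaN cells in the pivot result with 0
print(df.pivot_table(index='Date', columns='City',
                     values='Temperature', fill_value=0))

# dropna=True (the default) excludes columns whose entries are all NaN
print(df.pivot_table(index='Date', columns='City',
                     values='Temperature', dropna=True))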